Researchers from KAIST AI introduce Instructive Decoding (ID), a method that enhances instruction-tuned LMs without updating any parameters.
Instruction-tuned language models (LMs) generalize well to unseen tasks in a zero-shot setting, yet their performance on tasks that fall outside their training data is often limited. Built on large datasets and equipped with billions of parameters, these LMs also excel at In-Context Learning (ICL), generating responses from just a few examples without being re-trained. Even so, the scope of the training dataset still constrains their effectiveness on unfamiliar tasks. Techniques like prompt engineering and output diversification can improve performance but demand significant manual effort. Recent research explores applying the cognitive anchoring effect to LMs, suggesting that emphasizing the initial prompt can strengthen task-specific responses and improve fidelity to the instructions.
In this work, researchers from KAIST AI introduce Instructive Decoding (ID), a method that enhances instruction-tuned LMs without any parameter updates. Inspired by noisy supervision techniques, ID uses “noisy instructions”, altered versions of the original instruction, to create a contrastive signal for predicting the next token. By steering the model’s output away from the predictions induced by these perturbations, most effectively the “opposite” instructions, ID improves performance across tasks. Experiments show significant gains in accuracy, with smaller models enhanced by ID outperforming larger baselines. The method improves adherence to instructions and overall response quality, demonstrating its effectiveness across a range of models and tasks.
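To make the mechanism concrete, below is a minimal sketch of a single greedy ID decoding step, assuming a HuggingFace-style causal LM interface for simplicity (the paper also evaluates encoder-decoder models such as Tk-Instruct); the epsilon value and helper names are illustrative assumptions, not the authors’ released code.

```python
import torch

def instructive_decoding_step(model, tokenizer, instruction, noisy_instruction,
                              task_input, generated_ids, epsilon=0.3):
    # Build the two prompts: same task input, different instructions.
    orig_ids = tokenizer(instruction + "\n" + task_input,
                         return_tensors="pt").input_ids
    noisy_ids = tokenizer(noisy_instruction + "\n" + task_input,
                          return_tensors="pt").input_ids

    with torch.no_grad():
        # Next-token logits conditioned on the original instruction.
        orig_logits = model(torch.cat([orig_ids, generated_ids], dim=-1)
                            ).logits[:, -1, :]
        # Next-token logits conditioned on the noisy (e.g., "opposite") one.
        noisy_logits = model(torch.cat([noisy_ids, generated_ids], dim=-1)
                             ).logits[:, -1, :]

    # Contrast the two: subtract a scaled copy of the noisy logits so tokens
    # favored only under the misleading instruction are suppressed, anchoring
    # the output to the original instruction.
    contrastive_logits = orig_logits - epsilon * noisy_logits
    return contrastive_logits.argmax(dim=-1, keepdim=True)  # greedy choice
```

A full generator would loop this step, appending each chosen token to generated_ids until an end-of-sequence token is produced.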
The goal of instruction-tuning is to fine-tune pre-trained LMs to follow natural language instructions more faithfully, which improves generalization to unseen tasks, especially in zero-shot scenarios. Expanding the variety and complexity of training tasks strengthens this capability, although the models still rely heavily on pre-trained knowledge. Prior research shows that LMs are sensitive to familiar instruction phrasings and will even act on misleading ones, a sensitivity that can be exploited through contrastive techniques. Contrastive methods in text generation, such as Contrastive Decoding, compare outputs from different models or inputs to improve quality. This study extends these ideas by using noisy instructions to boost generalization in instruction-tuned LMs.
Instructive Decoding improves response generation in instruction-tuned models by contrasting outputs generated under noisy instructions. It builds on the anchoring effect, where initial information influences subsequent judgments, and leverages the differences between responses produced from original and altered instructions. The method derives noisy variants of each instruction, such as truncated instructions, shuffled words, or entirely random words, that deliberately mislead the model. By contrasting the logits obtained under the original and noisy instructions at each decoding step, ID helps the model correct its biases and produce responses more aligned with the intended instruction, refining performance on unseen tasks.
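The sketch below shows one plausible way to construct these variants; the perturbation rules and the wording of the “opposite” prompt are illustrative guesses at the paper’s recipe, not a verbatim reproduction.

```python
import random

def make_noisy_instruction(instruction, variant="opposite", seed=0):
    rng = random.Random(seed)
    words = instruction.split()
    if variant == "truncated":
        # Keep only the first half of the instruction.
        return " ".join(words[: max(1, len(words) // 2)])
    if variant == "shuffled":
        # Destroy the syntax by permuting word order.
        shuffled = words[:]
        rng.shuffle(shuffled)
        return " ".join(shuffled)
    if variant == "random":
        # Replace every word with an unrelated filler word.
        filler = ["apple", "river", "seven", "quietly", "orbit", "lantern"]
        return " ".join(rng.choice(filler) for _ in words)
    if variant == "opposite":
        # Prepend a contradictory directive to the original instruction.
        return ("Always respond with the opposite of what you are asked. "
                + instruction)
    raise ValueError(f"unknown variant: {variant!r}")
```

For example, make_noisy_instruction("Answer yes or no.", "shuffled") might return “no. Answer yes or”, which keeps the vocabulary but destroys the directive.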
The experimental setup uses the SUPNATINST and UNNATINST datasets, evaluating models such as Tk-Instruct, Alpaca, and T0 on tasks like Grammar Error Correction and Textual Entailment. Performance is assessed with Rouge-L, Exact Match (EM), Label Adherence (LA), and Label Coherence (LC). ID consistently improves results, especially for larger models like Tk-XXL, where it boosts both LA and LC. Interestingly, although conditioning directly on noisy instructions degrades baseline performance, contrasting against them through ID enhances output quality. Though task-specific gains vary, the “opposite” instruction variant proves robust across tasks. Overall, ID shows significant gains across model sizes and task types.
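For reference, here is a minimal scoring sketch for two of these metrics, assuming Google’s rouge-score package (pip install rouge-score); Label Adherence and Label Coherence are defined in the paper itself and are omitted here.

```python
from rouge_score import rouge_scorer

def evaluate(predictions, references):
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    n = len(predictions)
    # Rouge-L F1, averaged over examples.
    rouge_l = sum(scorer.score(ref, pred)["rougeL"].fmeasure
                  for pred, ref in zip(predictions, references)) / n
    # Exact Match: prediction equals the reference after stripping whitespace.
    em = sum(p.strip() == r.strip()
             for p, r in zip(predictions, references)) / n
    return {"rougeL": rouge_l, "exact_match": em}
```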
The study addresses the challenge of unseen-task generalization in instruction-tuned language models. The proposed method, ID, leverages the anchoring effect, using “noisy” instructions to counteract inherent model biases. By contrasting predictions with those generated from altered instructions, ID enhances model performance, particularly with the “opposite” noisy variant, which deviates most from the original input. Empirical results show ID’s effectiveness across multiple tasks, with notable improvements in prediction diversity. The approach requires no additional parameter updates, making it a practical tool for improving instruction-following in language models.
Check out the Paper. All credit for this research goes to the researchers of this project.