Large Language Models (LLMs) have shown remarkable improvements when explicitly trained on structured reasoning traces, enabling them to solve mathematical equations, infer logical conclusions, and perform multistep planning tasks. However, these models require significant computational resources to process lengthy reasoning traces. Researchers are actively exploring ways to enhance efficiency while maintaining the effectiveness of these models.
One of the primary challenges in LLM reasoning is the high computational cost of training and inference. When models produce step-by-step reasoning traces in natural language, much of the text serves to maintain textual coherence rather than to advance the reasoning itself. This leads to inefficient memory usage and longer processing times. Current methods aim to mitigate this by abstracting reasoning steps into compressed representations without losing critical information. However, models that attempt to internalize reasoning traces through a continuous latent space or multi-stage training often perform worse than those trained on full reasoning details.
Existing solutions have focused on reducing redundancy in reasoning traces by compressing intermediate steps. Some approaches use continuous latent representations, while others iteratively shorten reasoning sequences. However, these methods require complex training procedures and often fail to match the performance of explicit textual reasoning. Researchers therefore sought an alternative that reduces computational demands while preserving reasoning capability, and introduced a method that replaces part of the reasoning process with discrete latent tokens, improving efficiency without sacrificing accuracy.
A research team from Meta AI and UC Berkeley proposed a novel technique that integrates discrete latent tokens into LLM reasoning. They employed a vector-quantized variational autoencoder (VQ-VAE) to convert a portion of the stepwise reasoning process into compact representations. The method involves replacing early reasoning steps with latent abstractions while retaining later steps in textual form. This hybrid representation ensures the model maintains interpretability while reducing the token length of reasoning sequences. The key innovation is the randomized mixing of latent and text tokens, which enables the model to adapt seamlessly to new reasoning structures without extensive retraining.
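To make the quantization step concrete, the sketch below shows how a VQ-VAE-style codebook can map encoded reasoning steps to discrete latent token ids. It is a minimal PyTorch illustration, not the authors' released code: the class name `ReasoningVQVAE`, the codebook size of 512, and the embedding dimension of 256 are assumptions.

```python
import torch
import torch.nn as nn

CODEBOOK_SIZE = 512   # number of discrete latent tokens (assumed)
EMBED_DIM = 256       # encoder output dimension (assumed)

class ReasoningVQVAE(nn.Module):
    """Toy quantizer: maps encoder outputs to nearest codebook entries."""
    def __init__(self, codebook_size=CODEBOOK_SIZE, embed_dim=EMBED_DIM):
        super().__init__()
        # Each codebook row is the embedding of one discrete latent token.
        self.codebook = nn.Embedding(codebook_size, embed_dim)

    def quantize(self, z_e):
        # z_e: (batch, embed_dim) encodings of reasoning-step text chunks.
        dists = torch.cdist(z_e, self.codebook.weight)  # (batch, codebook_size)
        indices = dists.argmin(dim=-1)                  # discrete latent token ids
        z_q = self.codebook(indices)                    # quantized vectors
        # Straight-through estimator so gradients reach the encoder in training.
        z_q = z_e + (z_q - z_e).detach()
        return indices, z_q

vq = ReasoningVQVAE()
step_encodings = torch.randn(4, EMBED_DIM)  # stand-in for encoded reasoning steps
latent_ids, quantized = vq.quantize(step_encodings)
print(latent_ids.shape, quantized.shape)    # torch.Size([4]) torch.Size([4, 256])
```

Each early reasoning step can thus be represented by a single discrete id rather than a long run of text tokens, which is what shortens the trace.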
The researchers developed a training strategy incorporating latent tokens into LLM reasoning traces. During training, a controlled number of reasoning steps are replaced with their corresponding latent representations, ensuring that the model learns to interpret both abstracted and explicit reasoning structures. The randomization of latent token replacements allows adaptability across different problem types, improving the model’s generalization ability. Limiting the number of textual reasoning steps reduces input size, making LLMs more computationally efficient while maintaining reasoning performance.
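A hedged sketch of what that randomized replacement might look like at the data-preparation level: for each training trace, a random number of leading steps is swapped for latent token ids while the remaining steps stay as text. The `<latent_i>` string form and the `max_replace` cap are illustrative assumptions, not details from the paper.

```python
import random

def mix_latent_and_text(steps, latent_ids, max_replace):
    """Replace a random number of leading reasoning steps with latent tokens.

    steps:       list of text-token lists, one per reasoning step
    latent_ids:  discrete latent token id for each step (e.g., from a VQ-VAE)
    max_replace: cap on how many leading steps may be abstracted
    """
    m = random.randint(0, min(max_replace, len(steps)))
    # Early steps become compact latent tokens; later steps stay textual,
    # so the trace still ends in readable reasoning.
    latent_part = [f"<latent_{i}>" for i in latent_ids[:m]]
    text_part = [tok for step in steps[m:] for tok in step]
    return latent_part + text_part

steps = [["Add", "3", "and", "4."], ["Multiply", "by", "2."], ["Answer:", "14"]]
print(mix_latent_and_text(steps, latent_ids=[17, 403, 88], max_replace=2))
# e.g. ['<latent_17>', 'Multiply', 'by', '2.', 'Answer:', '14']
```

Because `m` is resampled per example, the model sees many different latent/text boundaries during training, which is what lets it generalize across reasoning structures.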
Furthermore, the researchers ensured that the extended vocabulary, including the newly introduced latent tokens, could be seamlessly integrated into the model without major modifications. The proposed method demonstrated significant performance improvements across various benchmarks, outperforming traditional chain-of-thought (CoT) models on mathematical reasoning tasks. On the MATH dataset, it achieved a 4.2% improvement over the previous best-performing methods; on the GSM8K benchmark, it yielded a 4.1% gain; and on the Fresh-Gaokao-Math-2023 dataset, it outperformed existing models by 13.3%.
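As an illustration of how such a vocabulary extension can be done in practice, the sketch below uses the Hugging Face Transformers API to register latent tokens and grow the embedding matrix. The model name and the codebook size of 512 are placeholders, and this is a plausible recipe rather than the authors' exact setup.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# One new special token per codebook entry: <latent_0> ... <latent_511>.
latent_tokens = [f"<latent_{i}>" for i in range(512)]
num_added = tokenizer.add_tokens(latent_tokens, special_tokens=True)

# Grow the embedding matrix: existing rows are untouched and the new rows
# are freshly initialized, so no other weights need modification.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} latent tokens; vocabulary size is now {len(tokenizer)}")
```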
The reduction in reasoning trace length was equally noteworthy, with an average decrease of 17%, which resulted in faster inference times and lower memory consumption.
Evaluations on logical reasoning datasets such as ProntoQA and ProsQA further validated the approach’s effectiveness, with accuracy improvements of 1.2% and 18.7%, respectively. The model achieved 100% accuracy on simpler reasoning tasks, demonstrating its capacity for efficient logical deduction.
The introduction of latent tokens has provided a significant step forward in optimizing LLM reasoning without compromising accuracy. By reducing the dependence on full-text reasoning sequences and leveraging discrete latent representations, the researchers have developed an approach that maintains efficiency while improving model generalization. The hybrid structure ensures that essential reasoning components are preserved, offering a practical solution to the challenge of balancing interpretability and computational efficiency. As LLMs continue to evolve, such methods may pave the way for more resource-efficient artificial intelligence systems that retain high levels of reasoning capability.