TALE: A Token-Budget-Aware LLM Reasoning Framework

2024/12/29 16:22

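Large language models (LLMs) have shown great potential on reasoning tasks, using methods such as Chain-of-Thought (CoT) to break complex problems into manageable steps. This capability comes at a price, however. CoT prompting typically increases token usage, which raises computational cost and energy consumption. That inefficiency is a problem for applications that need both accuracy and resource efficiency. Current LLMs often produce unnecessarily verbose outputs that do not always translate into better accuracy and that incur extra cost. The key challenge is to balance reasoning performance against resource efficiency.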

A team of researchers from Nanjing University, Rutgers University, and UMass Amherst has proposed a Token-Budget-Aware LLM Reasoning Framework to address the excessive token usage and high computational cost of Chain-of-Thought (CoT) prompting for large language models (LLMs).

The framework, named TALE (Token-Budget-Aware LLM rEasoning), operates in two stages: budget estimation and token-budget-aware reasoning. First, TALE uses techniques such as zero-shot prediction or regression-based estimators to assess the complexity of a reasoning task and derive an appropriate token budget. That budget is then written into the CoT prompt, guiding the LLM to generate concise yet accurate responses.
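
The two stages can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the prompt wording, the function names (`estimate_budget`, `build_budget_prompt`, `answer_with_budget`), the fallback budget of 100 tokens, and the stand-in `fake_llm` are all assumptions made for illustration. Any callable that maps a prompt string to a reply string could take the place of `fake_llm`.

```python
import re
from typing import Callable

# Hypothetical prompt wording; the article does not quote TALE's actual prompts.
ESTIMATE_TEMPLATE = (
    "Task: analyze the question below and estimate the minimum number of "
    "tokens needed to answer it. Reply with a single integer.\n\n"
    "Question: {question}"
)
REASON_TEMPLATE = (
    "{question}\n"
    "Let's think step by step and use less than {budget} tokens."
)


def estimate_budget(question: str, llm: Callable[[str], str], default: int = 100) -> int:
    """Stage 1 (zero-shot variant): ask the model itself for a token budget."""
    reply = llm(ESTIMATE_TEMPLATE.format(question=question))
    match = re.search(r"\d+", reply)  # take the first integer in the reply
    return int(match.group()) if match else default  # assumed fallback value


def build_budget_prompt(question: str, budget: int) -> str:
    """Stage 2: place the budget inside the CoT prompt."""
    return REASON_TEMPLATE.format(question=question, budget=budget)


def answer_with_budget(question: str, llm: Callable[[str], str]) -> str:
    """Run both stages: estimate a budget, then reason under it."""
    budget = estimate_budget(question, llm)
    return llm(build_budget_prompt(question, budget))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("Task:"):
        return "About 50 tokens."
    return "3 + 4 = 7. Answer: 7"


if __name__ == "__main__":
    print(build_budget_prompt("What is 3 + 4?", estimate_budget("What is 3 + 4?", fake_llm)))
    print(answer_with_budget("What is 3 + 4?", fake_llm))
```

The budget here is only a request in the prompt, not a hard cap on generation, which matches the article's description of the budget as guidance for the model.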

A key idea in TALE is "token elasticity," which identifies an optimal range of token budgets that minimizes token usage while preserving accuracy. Using iterative search techniques such as binary search, TALE can locate the optimal budget for different tasks and LLM architectures. On average, the framework reduces token usage by 68.64% with less than a 5% decrease in accuracy.
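
As a rough illustration of the search step, the sketch below runs a plain binary search for the smallest budget that still passes a check. The article does not describe TALE's exact search procedure, so this is a generic stand-in: `search_min_budget` and the `is_acceptable` predicate are hypothetical names, and the sketch assumes that once a budget is acceptable, every larger budget is acceptable too.

```python
from typing import Callable


def search_min_budget(is_acceptable: Callable[[int], bool], low: int, high: int) -> int:
    """Return the smallest budget in [low, high] for which is_acceptable is True.

    Assumes is_acceptable is monotone: False below some threshold, True at and
    above it. In practice the predicate would run the model with the given
    budget and check that the answer is still correct.
    """
    if not is_acceptable(high):
        raise ValueError("no acceptable budget in the given range")
    while low < high:
        mid = (low + high) // 2
        if is_acceptable(mid):
            high = mid  # mid works, so try smaller budgets
        else:
            low = mid + 1  # mid fails, so the threshold lies above it
    return low


if __name__ == "__main__":
    # Toy predicate: pretend answers stay correct from 77 tokens upward.
    print(search_min_budget(lambda budget: budget >= 77, 1, 318))  # prints 77
```

Each probe costs one or more model calls, so the logarithmic number of probes is the practical reason to prefer binary search over scanning every budget.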

Experiments on standard benchmarks such as GSM8K and MathBench show TALE's broad applicability and efficiency gains. On the GSM8K dataset, TALE reached 84.46% accuracy, surpassing the Vanilla CoT method while reducing the average token cost from 318.10 to 77.26. In the GSM8K-Zero setting, TALE cut token costs by 91% while maintaining an accuracy of 98.72%.
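
The GSM8K figures above imply a relative reduction that the article does not state directly. The short calculation below derives it from the two quoted averages; `percent_reduction` is a helper name chosen here for illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # Average GSM8K token cost quoted in the article: 318.10 -> 77.26.
    print(f"{percent_reduction(318.10, 77.26):.2f}%")  # prints 75.71%
```

That is roughly a 76% cut on GSM8K, which sits above the 68.64% average reported across tasks.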

TALE also generalizes across different LLMs, including GPT-4o-mini and Yi-lightning. On the MathBench-College dataset, it reduced token costs by up to 70% while maintaining competitive accuracy. The framework also lowers operational expenses, cutting costs by 59% on average compared with Vanilla CoT. These results indicate that TALE can improve efficiency with little loss in performance, which makes it suitable for a wide range of applications.

In conclusion, the Token-Budget-Aware LLM Reasoning Framework offers a practical answer to inefficient token usage in reasoning tasks. By dynamically estimating and applying token budgets, TALE balances accuracy against cost. This reduces computational expenses and makes advanced LLM capabilities more accessible. As AI continues to evolve, frameworks like TALE point toward more efficient and sustainable use of LLMs in both academic and industrial settings.

News source: www.marktechpost.com
