NVIDIA Is Helping Apple Build Faster, Better AI Experiences

2024/12/20 19:52

Apple and NVIDIA have shared details of a collaboration

Tech giants Apple and NVIDIA have joined forces to enhance the performance of Large Language Models (LLMs) by introducing a new text generation technique for AI.

According to Apple, accelerating LLM inference is a crucial ML research problem. This is because auto-regressive token generation is computationally expensive and relatively slow. As a result, improving inference efficiency can reduce latency for users.

In a research paper, the company writes that, in addition to its ongoing efforts to accelerate inference on Apple silicon, it has recently made significant progress in accelerating LLM inference on the NVIDIA GPUs widely used for production applications across the industry.

Earlier this year, Apple published and open-sourced Recurrent Drafter (ReDrafter), which is a novel approach to speculative decoding that “achieves state of the art performance.”

According to the company, ReDrafter uses an RNN draft model and combines beam search with dynamic tree attention to speed up LLM token generation, producing up to 3.5 tokens per generation step for open source models and surpassing the performance of prior speculative decoding techniques.

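
To make "tokens per generation step" concrete, here is a minimal sketch of plain greedy speculative decoding. It is not ReDrafter itself: it leaves out the RNN drafter, beam search, and dynamic tree attention, and uses a single linear draft instead. Both `target_model` and `draft_model` are hypothetical toy functions invented for illustration. A cheap drafter proposes several tokens, the large model checks them, and every accepted token saves one expensive step.

```python
from typing import List, Sequence, Tuple

Token = int
VOCAB_SIZE = 50  # assumption: tiny toy vocabulary


def target_model(context: Sequence[Token]) -> Token:
    """Hypothetical stand-in for the large LLM's greedy next-token choice."""
    return (sum(context) * 31 + len(context)) % VOCAB_SIZE


def draft_model(context: Sequence[Token]) -> Token:
    """Hypothetical cheap drafter: agrees with the target except when the
    context length is a multiple of 4 (an arbitrary rule to force misses)."""
    guess = target_model(context)
    return (guess + 1) % VOCAB_SIZE if len(context) % 4 == 0 else guess


def greedy_decode(prompt: Sequence[Token], num_tokens: int) -> List[Token]:
    """Baseline: one target-model step per generated token."""
    tokens = list(prompt)
    for _ in range(num_tokens):
        tokens.append(target_model(tokens))
    return tokens[len(prompt):]


def speculative_decode(
    prompt: Sequence[Token], num_tokens: int, draft_len: int = 4
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, target steps used)."""
    tokens = list(prompt)
    target_steps = 0
    while len(tokens) - len(prompt) < num_tokens:
        # 1. The drafter proposes draft_len tokens autoregressively.
        draft: List[Token] = []
        for _ in range(draft_len):
            draft.append(draft_model(tokens + draft))

        # 2. The target verifies the draft. A real system scores all draft
        #    positions in one forward pass; this loop only simulates that,
        #    so the whole verification counts as a single target step.
        target_steps += 1
        accepted: List[Token] = []
        for proposed in draft:
            expected = target_model(tokens + accepted)
            if proposed != expected:
                accepted.append(expected)  # target's correction ends the step
                break
            accepted.append(proposed)
        else:
            # Every draft token matched: the target adds one bonus token.
            accepted.append(target_model(tokens + accepted))
        tokens.extend(accepted)
    return tokens[len(prompt):len(prompt) + num_tokens], target_steps


if __name__ == "__main__":
    out, steps = speculative_decode([1, 2, 3], num_tokens=20)
    assert out == greedy_decode([1, 2, 3], num_tokens=20)
    print(f"{len(out) / steps:.2f} tokens per target step")
```

Because only tokens that match the target model's own greedy choice are accepted, the output is identical to ordinary greedy decoding; the gain is that the large model runs fewer times.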

“In benchmarking a tens-of-billions parameter production model on NVIDIA GPUs, using the NVIDIA TensorRT-LLM inference acceleration framework with ReDrafter, we have seen 2.7x speed-up in generated tokens per second for greedy decoding,” Apple's paper states.

As a result, this technology could significantly reduce the latency users experience, while also using fewer GPUs and consuming less power.

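
As a rough illustration of what the quoted 2.7x figure could mean for latency, assume (this is an assumption, not a claim from the paper) that response time is dominated by token generation. The symbols $N$ (tokens in a response), $r$ (baseline tokens per second), and $T$ (baseline generation time) are introduced here for illustration only:

```latex
T = \frac{N}{r}, \qquad
T_{\text{ReDrafter}} = \frac{N}{2.7\,r} = \frac{T}{2.7} \approx 0.37\,T
```

Under that assumption, generation time for the same response would drop by roughly 63%. Real end-to-end latency also includes prompt processing and other overheads, so the observed reduction would likely be smaller.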

News source: bgr.com
