NVIDIA Is Helping Apple Build Faster, Better AI Experiences

2024/12/20 19:52

Apple and NVIDIA have shared details of a collaboration.

Tech giants Apple and NVIDIA have joined forces to enhance the performance of Large Language Models (LLMs) by introducing a new text generation technique for AI.

According to Apple, accelerating LLM inference is a crucial ML research problem. This is because auto-regressive token generation is computationally expensive and relatively slow. As a result, improving inference efficiency can reduce latency for users.

In addition to its ongoing efforts to accelerate inference on Apple silicon, Apple writes in a research paper that it has recently made significant progress in accelerating LLM inference on the NVIDIA GPUs widely used for production applications across the industry.

Earlier this year, Apple published and open-sourced Recurrent Drafter (ReDrafter), which is a novel approach to speculative decoding that “achieves state of the art performance.”

According to the company, ReDrafter uses an RNN draft model and combines beam search with dynamic tree attention, generating up to 3.5 tokens per generation step for open-source models and surpassing the performance of prior speculative decoding techniques.
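
To make the idea of speculative decoding concrete, here is a minimal sketch of the basic draft-and-verify loop with greedy acceptance. It is not ReDrafter: there is no RNN, no beam search, and no tree attention, and both "models" (`target_next`, `draft_next`) are hypothetical toy functions invented for illustration. It only shows why a cheap draft model can let the expensive model emit several tokens per step without changing the output.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def target_next(context: List[Token]) -> Token:
    """Hypothetical 'large' model: a deterministic toy rule, not a real LLM."""
    return (context[-1] * 3 + 1) % 10


def draft_next(context: List[Token]) -> Token:
    """Hypothetical cheap draft model: matches the target except after a 9."""
    if context[-1] == 9:
        return 0  # deliberately wrong guess (the target would return 8)
    return (context[-1] * 3 + 1) % 10


def greedy_decode(prompt: List[Token], n_new: int,
                  target: NextToken = target_next) -> List[Token]:
    """Baseline: one target-model step per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def speculative_decode(prompt: List[Token], n_new: int, k: int,
                       target: NextToken = target_next,
                       draft: NextToken = draft_next) -> Tuple[List[Token], int]:
    """Draft k tokens, then let the target verify them in one counted step."""
    out = list(prompt)
    target_steps = 0
    while len(out) - len(prompt) < n_new:
        # 1) The draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        ctx = list(out)
        for _ in range(k):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)

        # 2) The target checks every proposed position. A real system does
        #    this in a single batched forward pass, so we count one step.
        target_steps += 1
        accepted: List[Token] = []
        ctx = list(out)
        for tok in proposal:
            expected = target(ctx)
            if tok != expected:
                accepted.append(expected)  # replace the first wrong guess
                break
            accepted.append(tok)
            ctx.append(tok)
        else:
            accepted.append(target(ctx))  # all guesses right: one bonus token
        out.extend(accepted)
    return out[: len(prompt) + n_new], target_steps


if __name__ == "__main__":
    baseline = greedy_decode([5], 8)
    spec, steps = speculative_decode([5], 8, k=4)
    print("same output:", spec == baseline)
    print("target steps:", steps, "vs", 8, "for plain greedy decoding")
```

Because every accepted token is checked against the target model's own greedy choice, the output is identical to plain greedy decoding; the saving comes only from needing fewer target-model steps.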

“In benchmarking a tens-of-billions parameter production model on NVIDIA GPUs, using the NVIDIA TensorRT-LLM inference acceleration framework with ReDrafter, we have seen 2.7x speed-up in generated tokens per second for greedy decoding,” Apple writes.
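
To put the 2.7x figure in perspective, the short calculation below converts a throughput speed-up into generation time. Only the 2.7x factor comes from the quote; the baseline of 20 tokens per second and the 500-token response are made-up illustrative numbers, and the sketch assumes total time is dominated by token generation.

```python
def generation_time_s(n_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to generate n_tokens at a given throughput."""
    return n_tokens / tokens_per_second


SPEEDUP = 2.7        # reported speed-up in generated tokens per second
BASELINE_TPS = 20.0  # hypothetical baseline throughput (assumption)
N_TOKENS = 500       # hypothetical response length (assumption)

baseline_s = generation_time_s(N_TOKENS, BASELINE_TPS)
accelerated_s = generation_time_s(N_TOKENS, BASELINE_TPS * SPEEDUP)

print(f"baseline:    {baseline_s:.1f} s")
print(f"accelerated: {accelerated_s:.1f} s")
print(f"time saved:  {1 - accelerated_s / baseline_s:.0%}")
```

Whatever the baseline, a 2.7x throughput gain cuts generation time to 1/2.7 of the original, or roughly 37 percent.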

As a result, this technology could significantly reduce the latency users experience, while also using fewer GPUs and consuming less power.

News source: bgr.com
