NVIDIA Is Helping Apple Build Faster, Better AI Experiences

2024/12/20 19:52

If you buy through a BGR link, we may earn an affiliate commission, helping support our expert product labs. Apple and NVIDIA shared details of a collaboration

Tech giants Apple and NVIDIA have joined forces to enhance the performance of Large Language Models (LLMs) by introducing a new text generation technique for AI.

According to Apple, accelerating LLM inference is a crucial ML research problem. This is because auto-regressive token generation is computationally expensive and relatively slow. As a result, improving inference efficiency can reduce latency for users.

In a research paper, the company writes that, in addition to ongoing efforts to accelerate inference on Apple silicon, it has recently made significant progress in accelerating LLM inference on the NVIDIA GPUs widely used for production applications across the industry.

Earlier this year, Apple published and open-sourced Recurrent Drafter (ReDrafter), which is a novel approach to speculative decoding that “achieves state of the art performance.”

According to the company, ReDrafter uses an RNN draft model, and combines beam search with dynamic tree attention to speed up LLM token generation by up to 3.5 tokens per generation step for open source models, surpassing the performance of prior speculative decoding techniques.

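The general idea behind speculative decoding can be illustrated with a minimal Python sketch. It is not ReDrafter: there is no RNN draft model, beam search, or dynamic tree attention here, only the basic greedy draft-then-verify loop. The `toy_target` and `toy_draft` functions, and every other name below, are hypothetical stand-ins for real models.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def speculative_step(prefix: List[Token], target_next: NextToken,
                     draft_next: NextToken, k: int) -> List[Token]:
    """Run one draft-then-verify step and return the tokens it produces."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    draft: List[Token] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        draft.append(tok)
        ctx.append(tok)

    # 2. The target model checks the proposals. A real system would do this
    #    in one batched forward pass; this sketch loops for clarity.
    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in draft:
        expected = target_next(ctx)
        if tok != expected:
            accepted.append(expected)  # keep the target's token, drop the rest
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every proposal matched, so the target also contributes one more token.
    accepted.append(target_next(ctx))
    return accepted


def generate(prefix: List[Token], target_next: NextToken, draft_next: NextToken,
             k: int, n_tokens: int) -> Tuple[List[Token], int]:
    """Generate n_tokens after prefix; also return the number of steps used."""
    out = list(prefix)
    steps = 0
    while len(out) - len(prefix) < n_tokens:
        out.extend(speculative_step(out, target_next, draft_next, k))
        steps += 1
    return out[:len(prefix) + n_tokens], steps


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical 'large' model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Hypothetical 'small' model: agrees with the target except after a 4."""
    return 9 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    tokens, steps = generate([0], toy_target, toy_draft, k=3, n_tokens=12)
    print(tokens, steps)
```

In this toy setup the output is identical to plain greedy decoding with the target model alone, but it needs fewer target steps than tokens generated, which is where the speed-up comes from.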

“In benchmarking a tens-of-billions parameter production model on NVIDIA GPUs, using the NVIDIA TensorRT-LLM inference acceleration framework with ReDrafter, we have seen 2.7x speed-up in generated tokens per second for greedy decoding,” Apple writes.

As a result, this technology could significantly reduce the latency users experience, while also using fewer GPUs and consuming less power.

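To make the 2.7x figure concrete, the short Python sketch below works through the arithmetic. It assumes, as a simplification, that the speed-up applies uniformly to the whole workload and ignores prompt processing and batching effects. The baseline throughput and fleet demand are hypothetical numbers chosen for illustration, not figures from Apple or NVIDIA.

```python
import math


def latency_reduction(speedup: float) -> float:
    """Fraction of generation time saved when throughput rises by `speedup`."""
    return 1.0 - 1.0 / speedup


def gpus_needed(demand_tokens_per_s: float, per_gpu_tokens_per_s: float) -> int:
    """GPUs required to serve a fixed aggregate token rate."""
    return math.ceil(demand_tokens_per_s / per_gpu_tokens_per_s)


SPEEDUP = 2.7               # tokens-per-second speed-up quoted in the article
BASELINE_TPS = 20.0         # hypothetical per-GPU throughput without ReDrafter
FLEET_DEMAND_TPS = 10_000.0  # hypothetical aggregate demand

saved = latency_reduction(SPEEDUP)
gpus_before = gpus_needed(FLEET_DEMAND_TPS, BASELINE_TPS)
gpus_after = gpus_needed(FLEET_DEMAND_TPS, BASELINE_TPS * SPEEDUP)

print(f"generation time saved: {saved:.0%}")
print(f"GPUs needed: {gpus_before} before, {gpus_after} after")
```

Under these assumptions, generating the same number of tokens takes roughly 1/2.7 of the original time, and the same demand can be served with proportionally fewer GPUs.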

News source: bgr.com
