NVIDIA Is Helping Apple Build Faster, Better AI Experiences

2024/12/20 19:52

Apple and NVIDIA share details of their collaboration

Tech giants Apple and NVIDIA have joined forces to enhance the performance of Large Language Models (LLMs) by introducing a new text generation technique for AI.

According to Apple, accelerating LLM inference is a crucial ML research problem. This is because auto-regressive token generation is computationally expensive and relatively slow. As a result, improving inference efficiency can reduce latency for users.

In addition to ongoing efforts to accelerate inference on Apple silicon, the company has recently made significant progress in accelerating LLM inference for the NVIDIA GPUs widely used for production applications across the industry, the company writes in a research paper.

Earlier this year, Apple published and open-sourced Recurrent Drafter (ReDrafter), which is a novel approach to speculative decoding that “achieves state of the art performance.”

According to the company, ReDrafter uses an RNN draft model and combines beam search with dynamic tree attention, so that up to 3.5 tokens are produced per generation step for open source models, surpassing the performance of prior speculative decoding techniques.
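
To make the idea of speculative decoding concrete, here is a minimal Python sketch of the general technique in its simplest greedy, single-sequence form. It is not ReDrafter itself: there is no RNN drafter, beam search, or tree attention, and `target_model`, `draft_model`, and the parameter `k` are hypothetical stand-ins invented for illustration. A cheap drafter proposes several tokens, the large model checks them, and every generation step keeps the matching prefix plus one token from the large model.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]  # context -> greedy next token


def target_model(ctx: List[Token]) -> Token:
    """Toy stand-in for the large LLM (an arbitrary rule, not a real model)."""
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_model(ctx: List[Token]) -> Token:
    """Toy stand-in for the cheap drafter. For illustration only, it agrees
    with the target except when the context length is a multiple of 4."""
    guess = target_model(ctx)
    return (guess + 1) % 50 if len(ctx) % 4 == 0 else guess


def plain_greedy(target: Model, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one target call, and one new token, per generation step."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def speculative_greedy(
    target: Model, draft: Model, prompt: List[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding. Returns (tokens, number of target steps)."""
    out = list(prompt)
    steps = 0
    while len(out) - len(prompt) < n_new:
        # 1. The drafter proposes k tokens, one after another.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target verifies the proposal. A real system scores all
        #    positions in one batched forward pass; here that single pass
        #    is simulated with a loop and counted as one step.
        steps += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # replace the first mismatch
                break
            accepted.append(tok)
        else:
            # Every drafted token matched, so the target adds a bonus token.
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out[: len(prompt) + n_new], steps


if __name__ == "__main__":
    prompt = [1, 2, 3]
    baseline = plain_greedy(target_model, prompt, 20)
    tokens, steps = speculative_greedy(target_model, draft_model, prompt, 20)
    print("same output as plain greedy decoding:", tokens == baseline)
    print("tokens per target step:", 20 / steps)
```

Because every accepted token is checked against the large model's own greedy choice, the output matches plain greedy decoding; the gain comes only from needing fewer large-model steps, and it grows with how often the drafter guesses correctly.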

“In benchmarking a tens-of-billions parameter production model on NVIDIA GPUs, using the NVIDIA TensorRT-LLM inference acceleration framework with ReDrafter, we have seen 2.7x speed-up in generated tokens per second for greedy decoding,” Apple writes.

With that, this technology could significantly reduce the latency users experience, while also using fewer GPUs and consuming less power.
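
As a rough back-of-the-envelope check on what a 2.7x throughput gain could mean, the short Python sketch below converts the reported speed-up into a relative time per generated token. The helper name `relative_cost` is hypothetical, and the calculation assumes the speed-up applies uniformly to every token and that GPU demand scales linearly with throughput; neither assumption comes from Apple's benchmark.

```python
def relative_cost(speedup: float) -> float:
    """Fraction of the baseline time (or GPU capacity) needed per token,
    assuming the speed-up applies uniformly (an illustrative assumption)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


if __name__ == "__main__":
    speedup = 2.7  # figure quoted for greedy decoding in the benchmark
    cost = relative_cost(speedup)
    print(f"time per token: {cost:.0%} of baseline")
    print(f"reduction: {1 - cost:.0%}")
```

Under those assumptions, each token would take a little over a third of the baseline time. Real deployments will differ with batch size, prompt length, and decoding strategy.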

News source: bgr.com
