CPUs Emerge as Contenders for Small Generative AI Models

2024/05/01 19:24

CPU-based generative AI: Intel and Ampere argue that their chips can handle smaller models. Software optimizations and hardware advances have reduced the performance penalty associated with CPU-only AI. Intel's Granite Rapids Xeon 6 and Ampere's Altra CPUs have shown promising results on small LLMs. Because of memory and compute bottlenecks, CPUs are unlikely to replace GPUs for larger models, but they show potential in enterprise applications serving smaller models.

CPUs Emerge as Viable Option for Running Small Generative AI Models

Amidst the proliferation of generative AI chatbots like ChatGPT and Gemini, discussions have centered on their dependence on high-performance computing resources such as GPUs and dedicated accelerators. However, recent advancements in CPU technology are challenging this paradigm, suggesting that CPUs can effectively handle smaller generative AI models.

Performance Enhancements through Software Optimizations and Hardware Improvements

Traditionally, running large language models (LLMs) on CPU cores has been hampered by slower performance. However, ongoing software optimizations and hardware enhancements are bridging this performance gap.

Intel has showcased promising results with its upcoming Granite Rapids Xeon 6 processor, demonstrating that it can run Meta's Llama2-70B model with a second-token latency of 82 milliseconds (ms), a significant improvement over its previous Xeon processors. Oracle has also reported impressive performance running the Llama2-7B model on Ampere's Altra CPUs, achieving throughput ranging from 33 to 119 tokens per second.

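
As a rough sanity check, second-token latency converts directly into a single-stream decode rate, which makes the two vendors' figures comparable. This is a minimal sketch using only the numbers quoted above:

```python
def tokens_per_second(second_token_latency_ms: float) -> float:
    """Single-stream decode rate implied by a per-token latency."""
    return 1000.0 / second_token_latency_ms

# Intel's quoted 82 ms second-token latency for Llama2-70B implies
# roughly 12 tokens generated per second for a single stream.
xeon6_llama2_70b = tokens_per_second(82)

# Ampere's quoted 33-119 tokens/s for Llama2-7B corresponds to an
# effective per-token latency of roughly 8-30 ms.
altra_latency_ms = (1000 / 119, 1000 / 33)
```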
Customizations and Collaborations Enhance Performance

These performance gains are attributed to custom software libraries and optimizations made in collaboration with Oracle. Intel and Oracle have subsequently shared performance data for Meta's newly launched Llama3 models, which exhibit similar characteristics.

Suitability for Small Models and Potential for Modestly Sized Models

Based on the available performance data, CPUs have emerged as a viable option for running small generative AI models. It is anticipated that CPUs may soon be capable of handling modestly sized models, especially at lower batch sizes.

Persistent Bottlenecks Limit Replaceability of GPUs and Accelerators for Larger Models

While CPUs demonstrate improved performance for generative AI workloads, it is important to note that various compute and memory bottlenecks prevent them from fully replacing GPUs or dedicated accelerators for larger models. For state-of-the-art generative AI models, specialized products like Intel's Gaudi accelerator are still necessary.

Overcoming Memory Limitations through Innovative Technologies

Unlike GPUs, CPUs rely on less expensive and more capacious DRAM modules for memory, which presents a significant advantage for running large models. However, CPUs are constrained by limited memory bandwidth compared to GPUs with HBM modules.

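
The bandwidth gap can be made concrete with a back-of-the-envelope model: during batch-1 autoregressive decoding, every weight must be streamed from memory once per generated token, so memory bandwidth divided by model size bounds the decode rate. The bandwidth figures below are illustrative assumptions, not vendor specifications:

```python
def max_decode_tokens_per_sec(params_billion: float,
                              bytes_per_param: float,
                              mem_bw_gb_per_s: float) -> float:
    """Bandwidth-bound upper limit on batch-1 decode throughput:
    all weights are read from memory once per generated token."""
    model_size_gb = params_billion * bytes_per_param
    return mem_bw_gb_per_s / model_size_gb

# Assumed bandwidths for illustration: ~400 GB/s for a DDR5 server
# socket, ~3000 GB/s for an HBM-equipped accelerator.
cpu_70b = max_decode_tokens_per_sec(70, 2, 400)    # fp16 70B on DRAM
gpu_70b = max_decode_tokens_per_sec(70, 2, 3000)   # same model on HBM
cpu_7b  = max_decode_tokens_per_sec(7, 1, 400)     # int8 7B on DRAM
```

On these assumed numbers, a 70B fp16 model is limited to a few tokens per second from DRAM while an int8 7B model clears 50 tokens per second, consistent with the article's point that smaller models are the CPU's natural territory.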
Intel's Granite Rapids Xeon 6 platform addresses this limitation with the introduction of Multiplexer Combined Rank (MCR) DIMMs, which facilitate much faster memory access. This technology, combined with Intel's enhanced AMX engine, doubles the effective performance and reduces model footprint and memory requirements.

Balanced Approach to AI Capability Optimization

CPU designers face the challenge of optimizing their products for a wide range of AI models. Instead of prioritizing the ability to run the most demanding LLMs, vendors focus on identifying the distribution of models and targeting enterprise-grade workloads.

Data from both Intel and Ampere suggests that the sweet spot for AI models in the current market lies within the 7-13 billion parameter range. These models are expected to remain mainstream, while frontier models may continue to grow in size at a slower pace.

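
A quick footprint calculation shows why the 7-13 billion parameter range is comfortable for CPU servers: the weights fit easily in commodity DRAM, especially once quantized. This sketch counts weight storage only, ignoring KV cache and activation memory:

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate storage required for the model weights alone."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Footprints across the "sweet spot" range at common weight precisions.
for params in (7, 13):
    for bits in (16, 8, 4):
        gb = weight_footprint_gb(params, bits)
        print(f"{params}B model at {bits}-bit weights: {gb:.1f} GB")
```

Even the 16-bit 13B case stays well under the DRAM capacity of a typical server socket, whereas the same comparison for a 70B-class model starts to strain cheaper configurations.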
Competitive Performance Against GPUs at Low Batch Sizes

Ampere's testing showed its CPUs performing competitively against Arm CPUs from AWS and Nvidia's A10 GPU at small batch sizes. However, GPUs gain an advantage at higher batch sizes due to their massive compute capacity.

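
The batch-size crossover this testing points to can be sketched with a toy roofline model: streaming the weights once per decode step serves every sequence in the batch, so aggregate throughput grows with batch size until the chip's compute limit caps it. The two ceiling values below are invented purely for illustration:

```python
def aggregate_tokens_per_sec(batch_size: int,
                             bw_bound_tps: float,
                             compute_bound_tps: float) -> float:
    """Toy roofline: per-stream decode is bandwidth-bound, so batching
    multiplies throughput until the compute ceiling is reached."""
    return min(batch_size * bw_bound_tps, compute_bound_tps)

# Hypothetical ceilings: a CPU with a low compute ceiling saturates
# early, while a GPU keeps scaling out to large batches.
cpu = [aggregate_tokens_per_sec(b, 30, 240)  for b in (1, 8, 64)]
gpu = [aggregate_tokens_per_sec(b, 40, 5000) for b in (1, 8, 64)]
```

Under these made-up ceilings the two are close at batch 1, but the GPU pulls roughly an order of magnitude ahead by batch 64, which is the shape of the tradeoff described above.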
Nonetheless, Ampere argues that the scalability of CPUs makes them more suitable for enterprise environments where the need for large-scale parallel processing is less common.

Conclusion

As generative AI technology evolves, CPUs are emerging as a viable option for running small and potentially modestly sized models, thanks to ongoing performance enhancements and innovative memory solutions. While GPUs and dedicated accelerators remain essential for larger models, CPUs are poised to play a significant role in the practical deployment of AI solutions for enterprise applications.

Disclaimer: info@kdj.com

The information provided is not trading advice. kDJ.com accepts no liability for any investment made on the basis of the information in this article. Cryptocurrencies are highly volatile; please research thoroughly and invest with caution!

If you believe content used on this site infringes your copyright, please contact us immediately (info@kdj.com) and we will remove it promptly.
