Alibaba's AI lab has unveiled a new iteration of its Qwen language model, capable of processing an astonishing one million tokens of text—roughly ten novels' worth. Moreover, the team managed to quadruple the processing speed.
Qwen has now expanded the context length of its Qwen2.5 language model, introduced in September, from 128,000 to 1 million tokens. This enables Qwen2.5-Turbo to handle ten complete novels, 150 hours of transcripts, or 30,000 lines of code.
Perfect accuracy in retrieving numbers
The model achieves 100 percent accuracy in the passkey retrieval task, which involves finding hidden numbers within 1 million tokens of irrelevant text, irrespective of the information's position in the document. This seems to partially overcome the "lost in the middle" phenomenon, where language models mainly focus on the beginning and end of a prompt.
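To make the setup concrete, here is a minimal sketch of how a passkey-retrieval prompt is typically constructed. This is an illustrative reconstruction, not Qwen's published evaluation code; the filler sentence, the `build_passkey_prompt` helper, and the word counts are assumptions chosen for clarity.

```python
import random

def build_passkey_prompt(passkey: str, total_words: int, depth: float) -> str:
    """Hide a passkey at a relative depth (0.0 = start, 1.0 = end)
    inside a long stretch of irrelevant filler text."""
    filler = "The grass is green. The sky is blue. The sun is bright. "
    needle = f"The passkey is {passkey}. Remember it. "
    question = "\nWhat is the passkey mentioned in the text above?"

    # Repeat the filler until the prompt reaches roughly the target length.
    n_repeats = max(1, total_words // len(filler.split()))
    chunks = [filler] * n_repeats
    # Bury the needle at the requested depth, e.g. 0.5 = middle of the haystack.
    chunks.insert(int(depth * len(chunks)), needle)
    return "".join(chunks) + question

random.seed(0)
prompt = build_passkey_prompt(
    passkey=str(random.randint(10000, 99999)),
    total_words=700_000,  # on the order of 1M tokens of filler
    depth=0.5,            # the "lost in the middle" position
)
```

Sweeping `depth` from 0.0 to 1.0 is what produces the position-independence result the benchmark measures.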
In several benchmarks for long text comprehension, Qwen2.5-Turbo outperforms competing models like GPT-4 and GLM4-9B-1M, while maintaining performance comparable to GPT-4o-mini with short sequences.
A screen recording from Qwen demonstrates the new language model's ability to quickly summarize Cixin Liu's complete "Three-Body" trilogy, which spans a total of 690,000 tokens. | Video: Qwen
Sparse attention speeds up inference by 4.3x
By employing sparse attention mechanisms, Qwen reduced the time to first token when processing 1 million tokens from 4.9 minutes to 68 seconds, marking a 4.3x speed increase.
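Qwen's announcement does not detail the exact sparsity pattern, but the core idea behind sparse attention can be illustrated with a block-sparse variant in which each query block attends only to an initial "sink" block plus a local window of neighboring blocks, so compute grows roughly linearly with sequence length rather than quadratically. The following numpy sketch is illustrative only; `block_sparse_attention`, the block size, and the window width are all assumed values, not Qwen's implementation.

```python
import numpy as np

def block_sparse_attention(q, k, v, block=64, window=2):
    """Each block of queries attends to the first (sink) block
    plus `window` preceding blocks and itself."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n // block):
        qs = slice(i * block, (i + 1) * block)
        lo = max(0, i - window)
        ks = slice(lo * block, (i + 1) * block)
        if lo > 0:
            # Prepend the sink block when it is outside the local window.
            keys = np.concatenate([k[:block], k[ks]])
            vals = np.concatenate([v[:block], v[ks]])
        else:
            keys, vals = k[ks], v[ks]
        scores = q[qs] @ keys.T / np.sqrt(d)
        # Numerically stable softmax over the visible keys only.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[qs] = weights @ vals
    return out

# Toy usage: 1,024 tokens, 32-dim heads.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(1024, 32)) for _ in range(3))
attn_out = block_sparse_attention(q, k, v)
```

Because each query block touches a bounded number of key blocks, the work per token stays constant as the sequence grows, which is what makes a shorter time to first token possible at 1 million tokens.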
The cost remains at 0.3 yuan (4 cents) per 1 million tokens. At the same price point, Qwen2.5-Turbo can now process 3.6x as many tokens as GPT-4o-mini.
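A quick back-of-the-envelope check of that 3.6x figure, assuming GPT-4o-mini's input price of $0.15 per million tokens and an exchange rate of roughly 7.2 yuan to the dollar (both figures are assumptions not stated in the announcement):

```python
# Price per 1M input tokens, in USD.
qwen_price_usd = 0.3 / 7.2        # 0.3 yuan -> ~$0.042 (assumed exchange rate)
gpt4o_mini_price_usd = 0.15       # assumed GPT-4o-mini input pricing

print(f"Tokens per dollar ratio: {gpt4o_mini_price_usd / qwen_price_usd:.1f}x")
# -> 3.6x, matching the figure quoted above
```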
Qwen2.5-Turbo is now accessible via Alibaba Cloud Model Studio's API, as well as through demos on HuggingFace and ModelScope.
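For readers who want to try it, the call below is a hypothetical sketch of querying the model through an OpenAI-compatible endpoint, which Alibaba Cloud Model Studio exposes; the base URL and the `qwen-turbo` model identifier are assumptions, so check the Model Studio documentation for the exact values.

```python
from openai import OpenAI

# Hypothetical sketch: endpoint URL and model name are assumed, not confirmed.
client = OpenAI(
    api_key="YOUR_DASHSCOPE_API_KEY",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

with open("novel.txt", encoding="utf-8") as f:
    long_document = f.read()  # a document of up to ~1M tokens

response = client.chat.completions.create(
    model="qwen-turbo",  # assumed identifier for Qwen2.5-Turbo
    messages=[
        {"role": "system", "content": "Summarize the following document."},
        {"role": "user", "content": long_document},
    ],
)
print(response.choices[0].message.content)
```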
Qwen notes that the current model may not always perform optimally on tasks involving long sequences in real-world applications.
There are still many unsolved challenges, including the model's weaker performance with long sequences and the high inference costs that hinder the deployment of larger models.
In the future, Qwen aims to further explore human preference alignment for long sequences, optimize inference efficiency to reduce computation time, and bring larger, more capable models with long context to market.
Why bother with large context windows?
The context windows of large language models have been steadily increasing in recent months. A practical standard has now settled between 128,000 (GPT-4o) and 200,000 (Claude 3.5 Sonnet) tokens, although there are outliers like Gemini 1.5 Pro with up to 10 million or Magic AI's LTM-2-mini with 100 million tokens.
While these advancements generally contribute to the usefulness of large language models, studies have consistently questioned the benefit of large context windows in comparison to RAG systems, where additional information is retrieved dynamically from vector databases.
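The contrast is easy to see in code. Instead of placing an entire corpus in the prompt, a RAG system embeds document chunks once and retrieves only the few most relevant ones at query time. The sketch below uses deterministic placeholder embeddings purely for illustration; `embed` stands in for a real embedding model, and the chunk texts are invented.

```python
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: deterministic pseudo-embeddings seeded from each text.
    # A real system would call an actual embedding model here.
    return np.array([
        np.random.default_rng(abs(hash(t)) % 2**32).normal(size=128)
        for t in texts
    ])

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    doc_vecs = embed(chunks)
    q_vec = embed([query])[0]
    # Cosine similarity between the query and every stored chunk.
    sims = doc_vecs @ q_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    )
    top = np.argsort(-sims)[:k]
    return [chunks[i] for i in top]

chunks = ["chunk one ...", "chunk two ...", "chunk three ..."]
context = "\n".join(retrieve("What does chapter one cover?", chunks))
# `context` (a few hundred tokens) goes into the prompt instead of the
# full corpus, trading context length for a retrieval step.
```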