Alibaba's AI lab has unveiled a new iteration of its Qwen language model, capable of processing an astonishing one million tokens of text—roughly ten novels' worth. Moreover, the team managed to quadruple the processing speed.
Qwen has now expanded the context length of its Qwen2.5 language model, introduced in September, from 128,000 to 1 million tokens. This enables Qwen2.5-Turbo to handle ten complete novels, 150 hours of transcripts, or 30,000 lines of code.
Perfect accuracy in retrieving numbers
The model achieves 100 percent accuracy in the passkey retrieval task, which involves finding a hidden number within 1 million tokens of irrelevant text, regardless of where the information sits in the document. This suggests the model at least partially overcomes the "lost in the middle" phenomenon, in which language models attend mainly to the beginning and end of a prompt.
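The passkey setup can be sketched as a small harness: bury one sentence containing a number at a random position in filler text, then score the model's answer. This is a minimal sketch, not Qwen's actual benchmark code; the filler sentence, line count, and regex scoring are illustrative assumptions.

```python
import random
import re

def build_passkey_prompt(n_filler_lines: int, passkey: int) -> str:
    """Bury a passkey sentence at a random position inside filler text,
    mimicking the needle-in-a-haystack setup behind the passkey test."""
    filler = "The grass is green. The sky is blue. The sun is bright."
    lines = [filler] * n_filler_lines
    needle = f"The passkey is {passkey}. Remember it."
    lines.insert(random.randrange(len(lines) + 1), needle)
    lines.append("What is the passkey?")
    return "\n".join(lines)

def score_response(response: str, passkey: int) -> bool:
    """A retrieval counts as correct only if the exact number appears."""
    match = re.search(r"\d+", response)
    return match is not None and int(match.group()) == passkey

# In the real benchmark the prompt would be sent to the model and the
# reply scored; here we only confirm the needle survives in the haystack.
prompt = build_passkey_prompt(n_filler_lines=1000, passkey=68423)
```

Accuracy "irrespective of position" means the score stays at 100 percent no matter where `random.randrange` places the needle.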
In several benchmarks for long text comprehension, Qwen2.5-Turbo outperforms competing models like GPT-4 and GLM4-9B-1M, while maintaining performance comparable to GPT-4o-mini with short sequences.
A screen recording from Qwen demonstrates the new language model quickly summarizing Cixin Liu's complete "Three-Body" trilogy, which spans a total of 690,000 tokens. | Video: Qwen
Sparse attention speeds up inference by 4.3x
By employing sparse attention mechanisms, Qwen reduced the time to first token when processing 1 million tokens from 4.9 minutes to 68 seconds, marking a 4.3x speed increase.
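The article does not specify which sparsity pattern Qwen uses. As one common illustration of the idea, a sliding-window scheme lets each query attend only to its most recent neighbors, cutting the number of score computations from O(n²) to O(n·w). This toy scalar version is a sketch of the principle, not Qwen's implementation:

```python
import math

def sliding_window_attention(q, k, v, window: int):
    """Toy 1-D attention: each query attends only to the `window` most
    recent positions. q, k, v are lists of floats standing in for
    embeddings; cost drops from O(n^2) to O(n * window) score pairs."""
    out = []
    for i in range(len(q)):
        lo = max(0, i - window + 1)
        scores = [q[i] * k[j] for j in range(lo, i + 1)]
        m = max(scores)                      # stabilized softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w * v[lo + j] for j, w in enumerate(weights)) / z)
    return out

n, w = 16, 4
q = k = v = [float(i) for i in range(n)]
dense_pairs = n * n                              # full attention
sparse_pairs = sum(min(w, i + 1) for i in range(n))  # windowed
print(f"score pairs: dense={dense_pairs}, sparse={sparse_pairs}")
```

At n = 16 with a window of 4 the sparse variant computes 58 score pairs instead of 256; the gap widens quadratically as sequences approach 1 million tokens, which is where the reported 4.3x wall-clock speedup comes from.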
The cost remains at 0.3 yuan (4 cents) per 1 million tokens. At the same price point, Qwen2.5-Turbo can now process 3.6x as many tokens as GPT-4o-mini.
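The 3.6x figure can be sanity-checked with back-of-the-envelope arithmetic. The exchange rate (~7.2 yuan per USD) and GPT-4o-mini's input price ($0.15 per 1 million tokens) are assumptions from outside the article:

```python
# Sanity-check the pricing claim: 0.3 yuan per 1M tokens vs GPT-4o-mini.
yuan_per_usd = 7.2                     # assumed exchange rate
qwen_usd_per_mtok = 0.3 / yuan_per_usd # ~ $0.042, i.e. ~4 cents
gpt4o_mini_usd_per_mtok = 0.15         # assumed GPT-4o-mini input price
ratio = gpt4o_mini_usd_per_mtok / qwen_usd_per_mtok
print(f"Qwen2.5-Turbo: ${qwen_usd_per_mtok:.3f}/1M tokens, "
      f"{ratio:.1f}x more tokens per dollar")
```

Under those assumptions the numbers line up: $0.15 divided by roughly $0.042 gives the stated 3.6x.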
Qwen2.5-Turbo is now accessible via Alibaba Cloud Model Studio's API, as well as through demos on HuggingFace and ModelScope.
Qwen notes that the current model may not always perform optimally on tasks involving long sequences in real-world applications.
There are still many unsolved challenges, including the model's weaker performance with long sequences and the high inference costs that hinder the deployment of larger models.
In the future, Qwen aims to further explore human preference alignment for long sequences, optimize inference efficiency to reduce computation time, and bring larger, more capable models with long context to market.
Why bother with large context windows?
The context windows of large language models have been steadily increasing in recent months. A practical standard has now settled between 128,000 (GPT-4o) and 200,000 (Claude 3.5 Sonnet) tokens, although there are outliers like Gemini 1.5 Pro with up to 10 million or Magic AI's LTM-2-mini with 100 million tokens.
While these advancements generally contribute to the usefulness of large language models, studies have consistently questioned the benefit of large context windows in comparison to RAG systems, where additional information is retrieved dynamically from vector databases.
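The retrieval step of such a RAG system can be sketched in a few lines: rank stored documents by vector similarity to the query and pass only the top matches to the model, rather than stuffing everything into the context window. The documents and embedding vectors below are made up for illustration; real systems use learned embedding models and an approximate-nearest-neighbor index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "vector database": document names with hypothetical embeddings.
docs = {
    "novel summary": [0.9, 0.1, 0.0],
    "api changelog": [0.1, 0.8, 0.2],
    "meeting notes": [0.2, 0.2, 0.9],
}

def retrieve(query_vec, k=1):
    """Return the k documents most similar to the query embedding --
    only these, not the whole corpus, go into the model's prompt."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]),
                    reverse=True)
    return ranked[:k]
```

The trade-off the studies probe is exactly this: a long-context model reads everything and relies on attention to find what matters, while RAG spends a cheap similarity search up front to shrink what the model must read.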