Alibaba's AI lab has unveiled a new iteration of its Qwen language model, capable of processing an astonishing one million tokens of text—roughly ten novels' worth. Moreover, the team managed to quadruple the processing speed.
Qwen has now expanded the context length of its Qwen2.5 language model, introduced in September, from 128,000 to 1 million tokens. This enables Qwen2.5-Turbo to handle ten complete novels, 150 hours of transcripts, or 30,000 lines of code.
Perfect accuracy in retrieving numbers
The model achieves 100 percent accuracy in the passkey retrieval task, which involves finding hidden numbers within 1 million tokens of irrelevant text, irrespective of the information's position in the document. This seems to partially overcome the "lost in the middle" phenomenon, where language models mainly focus on the beginning and end of a prompt.
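The passkey retrieval setup is straightforward to reproduce in miniature. The sketch below is purely illustrative (the filler sentence, passkey format, and prompt wording are assumptions, not Qwen's actual benchmark): it buries a random number at a chosen relative position inside repeated filler text, then asks the model to retrieve it.

```python
import random

# Common filler used in needle-in-a-haystack style tests
FILLER = ("The grass is green. The sky is blue. The sun is yellow. "
          "Here we go. There and back again. ")

def build_passkey_prompt(num_filler_repeats: int, insert_position: float):
    """Hide a random 6-digit passkey at a relative position (0.0–1.0)
    inside repeated filler text, and return (prompt, expected answer)."""
    passkey = str(random.randint(100000, 999999))
    needle = f"The passkey is {passkey}. Remember it. {passkey} is the passkey. "
    haystack = [FILLER] * num_filler_repeats
    haystack.insert(int(insert_position * num_filler_repeats), needle)
    prompt = ("".join(haystack)
              + "\nWhat is the passkey mentioned in the text above? "
                "Answer with the number only.")
    return prompt, passkey

# Example: hide the passkey halfway through the filler, then sweep
# insert_position from 0.0 to 1.0 and score how often the model's
# response contains `expected` at each depth.
prompt, expected = build_passkey_prompt(num_filler_repeats=1000,
                                        insert_position=0.5)
```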
In several benchmarks for long text comprehension, Qwen2.5-Turbo outperforms competing models like GPT-4 and GLM4-9B-1M, while maintaining performance comparable to GPT-4o-mini with short sequences.
A screen recording from Qwen demonstrates the new language model's ability to quickly summarize Cixin Liu's complete "Three-Body" trilogy, which spans a total of 690,000 tokens. | Video: Qwen
Sparse attention speeds up inference by 4.3x
By employing sparse attention mechanisms, Qwen reduced the time to first token when processing 1 million tokens from 4.9 minutes to 68 seconds, marking a 4.3x speed increase.
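Qwen's announcement does not detail the exact mechanism, but the basic idea behind sparse attention is easy to illustrate: instead of every query token attending to all preceding tokens (quadratic work, which dominates the time to first token on a 1-million-token prompt), each query attends only to a subset, such as a local window. A minimal numpy sketch of one such scheme, illustrative only:

```python
import numpy as np

def local_window_attention(q, k, v, window: int):
    """Sparse attention where each query attends only to the `window`
    most recent keys. For sequence length n this costs O(n * window)
    instead of the O(n^2) of full attention."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)                 # causal local window
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)  # scaled dot products
        weights = np.exp(scores - scores.max())     # stable softmax
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out

# 1,024 tokens, 64-dim head: each query touches at most 128 keys.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((1024, 64)) for _ in range(3))
y = local_window_attention(q, k, v, window=128)
print(y.shape)  # (1024, 64)
```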
The cost remains at 0.3 yuan (4 cents) per 1 million tokens. At the same price point, Qwen2.5-Turbo can now process 3.6x as many tokens as GPT-4o-mini.
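The 3.6x figure is easy to sanity-check. Assuming GPT-4o-mini's published input price of roughly $0.15 per million tokens at the time and an exchange rate of about 7.2 yuan to the dollar (both assumptions, not from the announcement):

```python
qwen_usd_per_mtok = 0.3 / 7.2     # 0.3 yuan per 1M tokens ≈ $0.042
gpt4o_mini_usd_per_mtok = 0.15    # assumed input price per 1M tokens
print(gpt4o_mini_usd_per_mtok / qwen_usd_per_mtok)  # ≈ 3.6
```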
Qwen2.5-Turbo is now accessible via Alibaba Cloud Model Studio's API, as well as through demos on HuggingFace and ModelScope.
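For orientation, a minimal API call might look like the sketch below. The endpoint URL and model identifier are assumptions based on Model Studio's OpenAI-compatible mode; verify both against Alibaba Cloud's current documentation.

```python
from openai import OpenAI  # pip install openai

# Endpoint and model name are assumed, not confirmed by the announcement.
client = OpenAI(
    api_key="YOUR_DASHSCOPE_API_KEY",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-turbo",
    messages=[{"role": "user", "content": "Summarize the text that follows: ..."}],
)
print(response.choices[0].message.content)
```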
Qwen notes that the current model may not always perform optimally on tasks involving long sequences in real-world applications.
There are still many unsolved challenges, including the model's weaker performance with long sequences and the high inference costs that hinder the deployment of larger models.
In the future, Qwen aims to further explore human preference alignment for long sequences, optimize inference efficiency to reduce computation time, and bring larger, more capable models with long context to market.
Why bother with large context windows?
The context windows of large language models have been steadily increasing in recent months. A practical standard has now settled between 128,000 (GPT-4o) and 200,000 (Claude 3.5 Sonnet) tokens, although there are outliers like Gemini 1.5 Pro with up to 10 million or Magic AI's LTM-2-mini with 100 million tokens.
While these advancements generally contribute to the usefulness of large language models, studies have consistently questioned the benefit of large context windows in comparison to RAG systems, where additional information is retrieved dynamically from vector databases.
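The RAG alternative works roughly like this: document chunks are embedded as vectors once, and at query time only the few chunks most similar to the question are retrieved and placed in the prompt, rather than the entire corpus. A minimal sketch with a stand-in embedding function (a real system would use an embedding model and a vector database):

```python
import numpy as np

def embed(texts):
    """Stand-in embedding: character-frequency vectors, L2-normalized
    so dot product equals cosine similarity. Illustrative only."""
    vecs = np.zeros((len(texts), 256))
    for i, t in enumerate(texts):
        for ch in t.lower():
            vecs[i, ord(ch) % 256] += 1
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

chunks = [
    "Qwen2.5-Turbo supports a 1M-token context window.",
    "Sparse attention cut time to first token to 68 seconds.",
    "The model costs 0.3 yuan per million tokens.",
]
query = "How fast is the first token?"

chunk_vecs, query_vec = embed(chunks), embed([query])[0]
top = np.argsort(chunk_vecs @ query_vec)[::-1][:2]  # 2 most similar chunks
context = "\n".join(chunks[i] for i in top)
# `context` is prepended to the prompt instead of the full corpus.
print(context)
```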