bitcoin: $98057.46 USD (+3.32%)
ethereum: $3319.91 USD (+7.53%)
tether: $1.00 USD (0.00%)
solana: $253.55 USD (+6.61%)
bnb: $626.83 USD (+2.99%)
xrp: $1.31 USD (+17.83%)
dogecoin: $0.382011 USD (-0.38%)
usd-coin: $0.999761 USD (-0.02%)
cardano: $0.830619 USD (+3.77%)
tron: $0.198210 USD (+1.21%)
avalanche: $35.36 USD (+4.95%)
shiba-inu: $0.000024 USD (+1.16%)
toncoin: $5.47 USD (+2.65%)
sui: $3.54 USD (-1.45%)
bitcoin-cash: $483.45 USD (+8.50%)

Cryptocurrency News Article

Qwen2.5-Turbo: Alibaba's new AI language model can process ten complete novels

2024/11/21 23:47

Alibaba's AI lab has released a new version of its Qwen language model that can process up to 1 million text tokens, the equivalent of roughly ten novels.

Alibaba's AI lab has unveiled a new iteration of its Qwen language model, capable of processing an astonishing one million tokens of text—roughly ten novels' worth. Moreover, the team managed to quadruple the processing speed.

Qwen has now expanded the context length of its Qwen2.5 language model, introduced in September, from 128,000 to 1 million tokens. This enables Qwen2.5-Turbo to handle ten complete novels, 150 hours of transcripts, or 30,000 lines of code.
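
For a rough sense of scale, the per-item token budgets implied by those figures can be read straight off the numbers; the sketch below is a back-of-envelope check derived only from the quantities quoted above, not official measurements.

```python
# Implied per-item token budgets, derived only from the figures quoted above:
# a 1,000,000-token context spread across 10 novels, 150 hours of transcripts,
# or 30,000 lines of code.
CONTEXT_TOKENS = 1_000_000

tokens_per_novel = CONTEXT_TOKENS / 10             # ~100,000 tokens per novel
tokens_per_transcript_hour = CONTEXT_TOKENS / 150  # ~6,700 tokens per hour
tokens_per_code_line = CONTEXT_TOKENS / 30_000     # ~33 tokens per line

print(tokens_per_novel, round(tokens_per_transcript_hour), round(tokens_per_code_line, 1))
```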

Perfect accuracy in retrieving numbers

The model achieves 100 percent accuracy in the passkey retrieval task, which involves finding hidden numbers within 1 million tokens of irrelevant text, irrespective of where the information sits in the document. This suggests the model partially overcomes the "lost in the middle" phenomenon, in which language models mainly attend to the beginning and end of a prompt.
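
To make the task concrete, here is a minimal sketch of how a passkey-retrieval prompt is typically constructed: a short "needle" containing the number is buried at a chosen depth inside filler text. The filler sentence and wording are illustrative, not Qwen's actual evaluation harness.

```python
import random

def build_passkey_prompt(total_words: int = 5000, depth: float = 0.5) -> tuple[str, str]:
    """Build a haystack of filler text with a hidden passkey at `depth` (0..1)."""
    passkey = str(random.randint(10_000, 99_999))
    filler = "The grass is green. The sky is blue. The sun is bright. "
    needle = f"The passkey is {passkey}. Remember it."

    # Repeat the filler to roughly total_words words, then splice the needle in.
    words = (filler * (total_words // len(filler.split()) + 1)).split()[:total_words]
    insert_at = int(len(words) * depth)
    haystack = " ".join(words[:insert_at] + [needle] + words[insert_at:])

    question = "What is the passkey mentioned in the text above?"
    return haystack + "\n\n" + question, passkey

prompt, expected = build_passkey_prompt(depth=0.5)  # needle buried in the middle
```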

In several benchmarks for long text comprehension, Qwen2.5-Turbo outperforms competing models like GPT-4 and GLM4-9B-1M, while maintaining performance comparable to GPT-4o-mini with short sequences.

A screen recording from Qwen demonstrates the new language model's ability to quickly summarize Cixin Liu's complete "Three-Body" trilogy (Remembrance of Earth's Past), which spans a total of 690,000 tokens. | Video: Qwen

Sparse attention speeds up inference by 4.3x

By employing sparse attention mechanisms, Qwen reduced the time to first token when processing 1 million tokens from 4.9 minutes to 68 seconds, marking a 4.3x speed increase.
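
The reported figures line up, as a quick check shows; the values below are taken directly from the paragraph above.

```python
# Time to first token at 1M input tokens, before and after sparse attention.
dense_ttft_s = 4.9 * 60   # 4.9 minutes = 294 seconds
sparse_ttft_s = 68
print(f"speedup: {dense_ttft_s / sparse_ttft_s:.1f}x")  # -> speedup: 4.3x
```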

The cost remains at 0.3 yuan (4 cents) per 1 million tokens. At the same price point, Qwen2.5-Turbo can now process 3.6x as many tokens as GPT-4o-mini.
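
Taken together, those two figures imply a competitor price of roughly 0.3 * 3.6 = 1.08 yuan per million tokens; the sketch below derives this only from the article's numbers, not from an official price list.

```python
qwen_price_yuan_per_m = 0.3   # 0.3 yuan (~4 cents) per 1M tokens
token_advantage = 3.6         # 3.6x as many tokens as GPT-4o-mini for the same spend
implied_competitor_price = qwen_price_yuan_per_m * token_advantage
print(f"implied competitor price: {implied_competitor_price:.2f} yuan per 1M tokens")  # 1.08
```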

Qwen2.5-Turbo is now accessible via Alibaba Cloud Model Studio's API, as well as through demos on HuggingFace and ModelScope.
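
A call through an OpenAI-compatible client might look like the sketch below, assuming Model Studio's compatible-mode endpoint; the base URL, model identifier, and key handling are assumptions to verify against the Model Studio documentation.

```python
# Minimal sketch of querying Qwen2.5-Turbo via an OpenAI-compatible API.
# The base_url and model name are assumptions, not confirmed values.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # issued by Alibaba Cloud Model Studio
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
)

with open("long_document.txt", encoding="utf-8") as f:
    document = f.read()  # e.g. a full novel or long transcript

response = client.chat.completions.create(
    model="qwen-turbo",  # assumed identifier for Qwen2.5-Turbo
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": document + "\n\nSummarize the text above."},
    ],
)
print(response.choices[0].message.content)
```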

Qwen notes that the current model may not always perform optimally on tasks involving long sequences in real-world applications.

There are still many unsolved challenges, including the model's weaker performance with long sequences and the high inference costs that hinder the deployment of larger models.

In the future, Qwen aims to further explore human preference alignment for long sequences, optimize inference efficiency to reduce computation time, and bring larger, more capable models with long context to market.

Why bother with large context windows?

The context windows of large language models have been steadily increasing in recent months. A practical standard has now settled between 128,000 (GPT-4o) and 200,000 (Claude 3.5 Sonnet) tokens, although there are outliers like Gemini 1.5 Pro with up to 10 million or Magic AI's LTM-2-mini with 100 million tokens.

While these advancements generally contribute to the usefulness of large language models, studies have consistently questioned the benefit of large context windows in comparison to RAG systems, where additional information is retrieved dynamically from vector databases.
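
As a contrast to the million-token approach, here is a minimal sketch of the retrieval step in such a RAG pipeline. TF-IDF stands in for the learned embeddings and vector database a production system would use, and the documents are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus: in a real system these chunks would live in a vector database.
chunks = [
    "Qwen2.5-Turbo extends the context window to one million tokens.",
    "Sparse attention reduces the time to first token to 68 seconds.",
    "The model costs 0.3 yuan per million input tokens.",
]
question = "How fast is the model at one million tokens?"

vectorizer = TfidfVectorizer().fit(chunks + [question])
scores = cosine_similarity(
    vectorizer.transform([question]), vectorizer.transform(chunks)
)[0]

# Keep only the most relevant chunks instead of sending the whole corpus.
top_k = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:2]
context = "\n".join(chunks[i] for i in top_k)
prompt = f"Answer using the context below.\n\n{context}\n\nQuestion: {question}"
print(prompt)
```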

Source: the-decoder.com

Disclaimer: info@kdj.com

The information provided is not trading advice. kdj.com assumes no liability for any investments made on the basis of the information provided in this article. Cryptocurrencies are highly volatile; please research thoroughly and invest with caution.

If you believe that content used on this website infringes your copyright, please contact us immediately (info@kdj.com) and we will remove it promptly.
