Qwen2.5-Turbo: Alibaba's New AI Language Model Can Process Ten Complete Novels

Nov 21, 2024 at 11:47 pm

Alibaba's AI laboratory has introduced a new version of its Qwen language model that can process up to one million tokens of text—equivalent to about ten novels.

Alibaba's AI lab has unveiled a new iteration of its Qwen language model, capable of processing an astonishing one million tokens of text—roughly ten novels' worth. Moreover, the team managed to quadruple the processing speed.

Qwen has now expanded the context length of its Qwen2.5 language model, introduced in September, from 128,000 to 1 million tokens. This enables Qwen2.5-Turbo to handle ten complete novels, 150 hours of transcripts, or 30,000 lines of code.

Perfect accuracy in retrieving numbers

The model achieves 100 percent accuracy in the passkey retrieval task, which involves finding hidden numbers within 1 million tokens of irrelevant text, irrespective of the information's position in the document. This seems to partially overcome the "lost in the middle" phenomenon, where language models mainly focus on the beginning and end of a prompt.
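To make the task concrete, here is a minimal sketch of how a passkey retrieval test could be set up. It is not Qwen's evaluation code: the filler text, the prompt wording, and the names build_passkey_prompt, is_correct, accuracy, and ask_model are all assumptions made for illustration, and ask_model stands in for whatever function sends a prompt to a model and returns its answer.

```python
import random

# Hypothetical filler sentence; any irrelevant text would do.
FILLER = "The grass is green. The sky is blue. The sun is yellow. "


def build_passkey_prompt(passkey: str, n_filler: int, depth: float) -> str:
    """Hide one passkey sentence among n_filler copies of the filler text.

    depth is the relative position of the passkey: 0.0 = start, 1.0 = end.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    chunks = [FILLER] * n_filler
    needle = f"The pass key is {passkey}. Remember it. "
    chunks.insert(round(depth * n_filler), needle)
    return "".join(chunks) + "What is the pass key?"


def is_correct(answer: str, passkey: str) -> bool:
    """Count an answer as correct if it contains the hidden number."""
    return passkey in answer


def accuracy(ask_model, depths, n_filler: int = 1000, seed: int = 0) -> float:
    """Fraction of depths at which ask_model (an assumed callable) finds the passkey."""
    depths = list(depths)
    rng = random.Random(seed)
    hits = 0
    for depth in depths:
        passkey = str(rng.randint(10000, 99999))
        prompt = build_passkey_prompt(passkey, n_filler, depth)
        hits += is_correct(ask_model(prompt), passkey)
    return hits / len(depths)
```

Sweeping depth from 0.0 to 1.0 is what exposes a "lost in the middle" effect: a model that only attends to the start and end of the prompt would score well at the extremes and poorly around 0.5.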

In several benchmarks for long text comprehension, Qwen2.5-Turbo outperforms competing models like GPT-4 and GLM4-9B-1M, while maintaining performance comparable to GPT-4o-mini with short sequences.

A screen recording from Qwen demonstrates the new language model's ability to quickly summarize Cixin Liu's complete "Trisolaris" trilogy, which spans a total of 690,000 tokens.

Sparse attention speeds up inference by 4.3x

By employing sparse attention mechanisms, Qwen reduced the time to first token when processing 1 million tokens from 4.9 minutes to 68 seconds, marking a 4.3x speed increase.
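The 4.3x figure follows directly from the two latencies quoted above. A short check, using only the numbers in the article (the function name speedup is an illustrative choice):

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of the old latency to the new latency."""
    return before_seconds / after_seconds


before = 4.9 * 60  # 4.9 minutes expressed in seconds (294 s)
after = 68         # seconds

print(f"{speedup(before, after):.1f}x")  # prints 4.3x
```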

The cost remains at 0.3 yuan (4 cents) per 1 million tokens. At the same price point, Qwen2.5-Turbo can now process 3.6x as many tokens as GPT-4o-mini.
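As a rough illustration of what this pricing means in practice, the sketch below applies the quoted rate to the 690,000-token trilogy mentioned earlier. It assumes the 0.3 yuan rate applies uniformly to every token processed; the comparison price is only what the article's 3.6x ratio implies, not a quoted GPT-4o-mini price. The names PRICE_YUAN_PER_MILLION and cost_yuan are illustrative.

```python
PRICE_YUAN_PER_MILLION = 0.3  # rate quoted in the article


def cost_yuan(tokens: int, price_per_million: float = PRICE_YUAN_PER_MILLION) -> float:
    """Cost of processing a given number of tokens at a flat per-million rate."""
    return tokens / 1_000_000 * price_per_million


trilogy_cost = cost_yuan(690_000)
print(f"{trilogy_cost:.3f} yuan")  # prints 0.207 yuan

# "3.6x as many tokens at the same price" implies the comparison model
# costs about 3.6 times as much per token (an inference, not a quoted price).
implied_comparison_price = PRICE_YUAN_PER_MILLION * 3.6
print(f"{implied_comparison_price:.2f} yuan per 1 million tokens")  # prints 1.08 ...
```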

Qwen2.5-Turbo is now accessible via Alibaba Cloud Model Studio's API, as well as through demos on HuggingFace and ModelScope.

Qwen notes that the current model may not always perform optimally on tasks involving long sequences in real-world applications.

There are still many unsolved challenges, including the model's weaker performance with long sequences and the high inference costs that hinder the deployment of larger models.

In the future, Qwen aims to further explore human preference alignment for long sequences, optimize inference efficiency to reduce computation time, and bring larger, more capable models with long context to market.

Why bother with large context windows?

The context windows of large language models have been steadily increasing in recent months. A practical standard has now settled between 128,000 (GPT-4o) and 200,000 (Claude 3.5 Sonnet) tokens, although there are outliers like Gemini 1.5 Pro with up to 10 million or Magic AI's LTM-2-mini with 100 million tokens.

While these advancements generally contribute to the usefulness of large language models, studies have consistently questioned the benefit of large context windows in comparison to RAG systems, where additional information is retrieved dynamically from vector databases.
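For readers unfamiliar with the RAG approach, the following toy sketch shows the basic idea: instead of placing every document in the context window, only the passages most similar to the question are retrieved and added to the prompt. It is a deliberately simplified stand-in: the word-count "embedding" replaces a real embedding model, the in-memory list replaces a vector database, and the names embed, cosine, and retrieve are assumptions for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (a real system would use a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


docs = [
    "sparse attention reduces time to first token",
    "the passkey retrieval task hides a number in filler text",
    "context windows range from 128,000 to 1 million tokens",
]
question = "how does sparse attention affect time to first token"
top = retrieve(question, docs)

# Only the retrieved passage, not the whole corpus, goes into the prompt.
prompt = f"Context: {top[0]}\nQuestion: {question}"
```

The trade-off is that a long context window lets the model see everything at once, while retrieval keeps prompts short but depends on the retriever finding the right passage.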
