
Qwen2.5-Turbo: Alibaba's New AI Language Model Can Process Ten Complete Novels

Nov 21, 2024 at 11:47 pm

Alibaba's AI laboratory has introduced a new version of its Qwen language model that can process up to one million tokens of text—equivalent to about ten novels.

Alibaba's AI lab has unveiled a new iteration of its Qwen language model, capable of processing an astonishing one million tokens of text—roughly ten novels' worth. Moreover, the team managed to quadruple the processing speed.

The Qwen team has expanded the context length of its Qwen2.5 language model, introduced in September, from 128,000 to 1 million tokens. This enables Qwen2.5-Turbo to handle ten complete novels, 150 hours of transcripts, or 30,000 lines of code.
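Whether a given document actually fits in that window can be checked with a tokenizer before anything is sent to the model. A minimal sketch, assuming the tokenizer of the openly released Qwen2.5-7B-Instruct checkpoint is a reasonable proxy for Qwen2.5-Turbo's (the article does not say which tokenizer the hosted model uses) and using a placeholder file name:

```python
# Sketch: check whether a document fits in a 1M-token window. The open
# Qwen2.5-7B-Instruct tokenizer is used as a stand-in for Qwen2.5-Turbo's
# tokenizer; this is an assumption, not confirmed by the article.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

def fits_in_context(text: str, max_tokens: int = 1_000_000) -> bool:
    """Return True if `text` tokenizes to at most `max_tokens` tokens."""
    n_tokens = len(tokenizer.encode(text))
    print(f"{n_tokens:,} tokens used of {max_tokens:,} available")
    return n_tokens <= max_tokens

with open("novel.txt", encoding="utf-8") as f:  # placeholder file name
    print(fits_in_context(f.read()))
```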

Perfect accuracy in retrieving numbers

The model achieves 100 percent accuracy in the passkey retrieval task, which involves finding hidden numbers within 1 million tokens of irrelevant text, regardless of where the information appears in the document. This suggests the model at least partially overcomes the "lost in the middle" phenomenon, in which language models mainly attend to the beginning and end of a prompt.
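The task itself is easy to reproduce. A minimal sketch of a passkey retrieval prompt, with filler text and phrasing that are my own illustrative choices rather than Qwen's actual evaluation setup:

```python
# Sketch: a passkey retrieval test in the spirit described above. A random
# number is buried at a random position in filler text; the model passes
# if it can repeat the number back. Filler and phrasing are assumptions,
# not Qwen's actual benchmark prompts.
import random

def make_passkey_prompt(total_words: int = 700_000) -> tuple[str, str]:
    passkey = str(random.randint(10_000, 99_999))
    filler = "The grass is green. The sky is blue. The sun is bright."
    words: list[str] = []
    while len(words) < total_words:
        words.extend(filler.split())
    # Insert the needle anywhere, including the middle of the document,
    # where "lost in the middle" effects are strongest.
    pos = random.randint(0, len(words))
    words[pos:pos] = f"The passkey is {passkey}. Remember it.".split()
    prompt = " ".join(words) + "\n\nWhat is the passkey mentioned in the text?"
    return prompt, passkey

prompt, expected = make_passkey_prompt()
# Send `prompt` to the model and compare its answer against `expected`.
```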

In several benchmarks for long text comprehension, Qwen2.5-Turbo outperforms competing models like GPT-4 and GLM4-9B-1M, while maintaining performance comparable to GPT-4o-mini with short sequences.

A screen recording from Qwen demonstrates the new language model quickly summarizing Cixin Liu's complete "Three-Body" trilogy ("Remembrance of Earth's Past"), which spans a total of 690,000 tokens. | Video: Qwen

Sparse attention speeds up inference by 4.3x

By employing sparse attention mechanisms, Qwen reduced the time to first token when processing 1 million tokens from 4.9 minutes (294 seconds) to 68 seconds, a 4.3x speedup.
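The article does not say which sparse pattern Qwen uses, but the arithmetic behind such speedups is straightforward: instead of scoring every query against every key, each token attends to a restricted set. A minimal sketch with a sliding-window mask, one common sparse pattern, chosen here purely for illustration:

```python
# Sketch: why sparse attention helps. Full causal attention scores
# n*(n+1)/2 query-key pairs; a sliding window of width w scores only
# about n*w. The sliding-window pattern is illustrative; the article
# does not disclose which sparse pattern Qwen2.5-Turbo actually uses.
import numpy as np

def sliding_window_mask(n: int, window: int) -> np.ndarray:
    """True where query i may attend to key j (causal, windowed)."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return (j <= i) & (j > i - window)

print(sliding_window_mask(6, 3).astype(int))  # small demo mask

n, w = 1_000_000, 1_024
full_pairs = n * (n + 1) // 2
windowed_pairs = n * w  # upper bound; early rows have fewer than w keys
print(f"full: {full_pairs:.2e} pairs, windowed: {windowed_pairs:.2e} pairs")
```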

The cost remains at 0.3 yuan (4 cents) per 1 million tokens. At the same price point, Qwen2.5-Turbo can now process 3.6x as many tokens as GPT-4o-mini.
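That ratio follows from list prices. A quick check, assuming GPT-4o-mini's then-current input price of $0.15 per million tokens and an exchange rate of roughly 7.2 yuan to the dollar (both my assumptions; the article states only the ratio):

```python
# Sketch: reproducing the 3.6x figure. The GPT-4o-mini price and the
# CNY/USD rate are assumptions based on public figures at the time.
qwen_usd_per_m = 0.3 / 7.2        # 0.3 yuan per 1M tokens, about $0.042
gpt4o_mini_usd_per_m = 0.15       # assumed input-token price
print(f"{gpt4o_mini_usd_per_m / qwen_usd_per_m:.1f}x tokens per dollar")
# -> 3.6x
```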

Qwen2.5-Turbo is now accessible via Alibaba Cloud Model Studio's API, as well as through demos on HuggingFace and ModelScope.
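For the API route, Model Studio exposes an OpenAI-compatible interface. A minimal sketch, where the base URL and the model identifier "qwen-turbo" are assumptions drawn from Alibaba Cloud's DashScope compatible mode and should be checked against the current docs:

```python
# Sketch: calling Qwen via Alibaba Cloud Model Studio's OpenAI-compatible
# endpoint. Base URL and model name are assumptions; verify both in the
# Model Studio documentation before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DASHSCOPE_API_KEY",  # issued in the Model Studio console
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-turbo",  # assumed identifier for Qwen2.5-Turbo
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the following novel in ten bullet points."},
    ],
)
print(response.choices[0].message.content)
```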

Qwen notes that the current model may not always perform optimally on tasks involving long sequences in real-world applications.

There are still many unsolved challenges, including the model's weaker performance with long sequences and the high inference costs that hinder the deployment of larger models.

In the future, Qwen aims to further explore human preference alignment for long sequences, optimize inference efficiency to reduce computation time, and bring larger, more capable models with long context to market.

Why bother with large context windows?

The context windows of large language models have been steadily increasing in recent months. A practical standard has now settled between 128,000 (GPT-4o) and 200,000 (Claude 3.5 Sonnet) tokens, although there are outliers like Gemini 1.5 Pro with up to 10 million or Magic AI's LTM-2-mini with 100 million tokens.

While these advancements generally contribute to the usefulness of large language models, studies have consistently questioned the benefit of large context windows in comparison to RAG systems, where additional information is retrieved dynamically from vector databases.
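For contrast, a RAG pipeline retrieves only the passages relevant to a question and hands those to a model with an ordinary context window. A minimal sketch, with the embedding model, chunk size, and file name chosen purely for illustration:

```python
# Sketch: the RAG alternative to a huge context window. Instead of
# stuffing 1M tokens into the prompt, rank document chunks by similarity
# to the question and keep only the best few. Model choice and chunking
# here are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def top_k_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by cosine similarity to the question; keep the best k."""
    vecs = encoder.encode([question] + chunks, normalize_embeddings=True)
    scores = vecs[1:] @ vecs[0]
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

with open("novel.txt", encoding="utf-8") as f:  # placeholder file name
    document = f.read()
chunks = [document[i:i + 2000] for i in range(0, len(document), 2000)]
context = "\n---\n".join(top_k_chunks("Who hid the passkey?", chunks))
# Feed `context` plus the question to a model with a standard context window.
```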

News source: the-decoder.com

Disclaimer: info@kdj.com

The information provided is not trading advice. kdj.com does not assume any responsibility for any investments made based on the information provided in this article. Cryptocurrencies are highly volatile and it is highly recommended that you invest with caution after thorough research!

If you believe that the content used on this website infringes your copyright, please contact us immediately (info@kdj.com) and we will delete it promptly.
