Qwen2.5-Turbo: Alibaba's new AI language model can process ten complete novels

2024/11/21 23:47

Alibaba's AI lab has unveiled a new version of its Qwen language model that can process up to one million tokens of text, roughly ten novels' worth. The team also made processing at that length about four times faster.

Qwen has now expanded the context length of its Qwen2.5 language model, introduced in September, from 128,000 to 1 million tokens. This enables Qwen2.5-Turbo to handle ten complete novels, 150 hours of transcripts, or 30,000 lines of code.
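
These equivalences imply rough per-unit token budgets. The back-of-the-envelope sketch below derives them from the figures above. The helper `fits_in_context` and the averages are illustrative only, since real token counts depend on the tokenizer and the text.

```python
CONTEXT_OLD = 128_000      # previous Qwen2.5 context length, in tokens
CONTEXT_NEW = 1_000_000    # Qwen2.5-Turbo context length, in tokens

# Averages implied by "ten novels, 150 hours of transcripts, or 30,000 lines of code".
TOKENS_PER_NOVEL = CONTEXT_NEW / 10
TOKENS_PER_TRANSCRIPT_HOUR = CONTEXT_NEW / 150
TOKENS_PER_CODE_LINE = CONTEXT_NEW / 30_000


def fits_in_context(n_tokens: int, context_length: int) -> bool:
    """Return True if a prompt of n_tokens fits inside the context window."""
    return n_tokens <= context_length


print(round(TOKENS_PER_NOVEL))            # 100000
print(round(TOKENS_PER_TRANSCRIPT_HOUR))  # 6667
print(round(TOKENS_PER_CODE_LINE))        # 33
print(CONTEXT_NEW / CONTEXT_OLD)          # 7.8125
```

By these averages, three novels (about 300,000 tokens) would overflow the old 128,000-token window but fit comfortably in the new one.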

Perfect accuracy in retrieving numbers

The model achieves 100 percent accuracy in the passkey retrieval task, which involves finding hidden numbers within 1 million tokens of irrelevant text, irrespective of the information's position in the document. This seems to partially overcome the "lost in the middle" phenomenon, where language models mainly focus on the beginning and end of a prompt.
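
A passkey retrieval test of this kind can be sketched in a few lines. This is a minimal illustration, not Qwen's actual evaluation harness: the filler sentence, the prompt wording, and the helper names `build_passkey_prompt`, `extract_passkey`, and `accuracy` are all assumptions made for the example. It hides a random five-digit number at a chosen relative depth inside repetitive filler text and scores a model's answers against it.

```python
import random
import re
from typing import List, Optional, Tuple

# Hypothetical filler sentence; a real test repeats irrelevant text up to the target length.
FILLER = "The grass is green. The sky is blue. The sun is yellow. "


def build_passkey_prompt(n_filler: int, depth: float, rng: random.Random) -> Tuple[str, str]:
    """Hide a 5-digit passkey at relative position depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    passkey = f"{rng.randrange(10**5):05d}"
    needle = f"The pass key is {passkey}. Remember it. "
    chunks = [FILLER] * n_filler
    chunks.insert(round(n_filler * depth), needle)
    prompt = "".join(chunks) + "What is the pass key?"
    return prompt, passkey


def extract_passkey(answer: str) -> Optional[str]:
    """Pull the first standalone 5-digit number out of a free-text answer."""
    match = re.search(r"\b(\d{5})\b", answer)
    return match.group(1) if match else None


def accuracy(answers: List[str], expected: List[str]) -> float:
    """Fraction of answers whose extracted passkey matches the expected one."""
    hits = sum(extract_passkey(a) == e for a, e in zip(answers, expected))
    return hits / len(expected)
```

Sweeping `depth` from 0.0 to 1.0 while holding the prompt length fixed is what exposes a "lost in the middle" effect: a model affected by it scores well near the ends and worse around 0.5.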

In several benchmarks for long text comprehension, Qwen2.5-Turbo outperforms competing models like GPT-4 and GLM4-9B-1M, while maintaining performance comparable to GPT-4o-mini with short sequences.

A screen recording from Qwen shows the new language model quickly summarizing Cixin Liu's complete "Three-Body Problem" trilogy ("Trisolaris" in the German edition), which totals 690,000 tokens. | Video: Qwen

Sparse attention speeds up inference by 4.3x

By employing sparse attention mechanisms, Qwen reduced the time to first token when processing 1 million tokens from 4.9 minutes to 68 seconds, marking a 4.3x speed increase.
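
The 4.3x figure follows directly from the two latencies. A quick check, using only the numbers quoted above (the helper name `speedup` is ours):

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of the old latency to the new latency."""
    if after_seconds <= 0:
        raise ValueError("after_seconds must be positive")
    return before_seconds / after_seconds


before = 4.9 * 60  # 4.9 minutes to first token, converted to seconds (about 294 s)
after = 68.0       # 68 seconds to first token

print(f"{speedup(before, after):.1f}x")  # prints "4.3x"
```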

The cost remains at 0.3 yuan (4 cents) per 1 million tokens. At the same price point, Qwen2.5-Turbo can now process 3.6x as many tokens as GPT-4o-mini.
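
The price comparison can be restated as simple arithmetic. The sketch below uses only the two figures quoted above. The "implied" comparison price is derived from the 3.6x ratio and is not an official GPT-4o-mini list price, and the helper name `tokens_for_budget` is ours.

```python
QWEN_YUAN_PER_MILLION = 0.3        # price quoted in the article
TOKEN_RATIO_VS_GPT_4O_MINI = 3.6   # "3.6x as many tokens" at the same price


def tokens_for_budget(budget_yuan: float, yuan_per_million: float) -> float:
    """Number of tokens a budget buys at a given price per million tokens."""
    if yuan_per_million <= 0:
        raise ValueError("yuan_per_million must be positive")
    return budget_yuan / yuan_per_million * 1_000_000


# Price per million tokens that the 3.6x ratio implies for the comparison model.
implied_yuan_per_million = QWEN_YUAN_PER_MILLION * TOKEN_RATIO_VS_GPT_4O_MINI

qwen_tokens = tokens_for_budget(1.0, QWEN_YUAN_PER_MILLION)
other_tokens = tokens_for_budget(1.0, implied_yuan_per_million)
print(f"{implied_yuan_per_million:.2f} yuan per 1M tokens")  # 1.08 yuan per 1M tokens
print(f"{qwen_tokens / other_tokens:.1f}x")                  # 3.6x
```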

Qwen2.5-Turbo is now accessible via Alibaba Cloud Model Studio's API, as well as through demos on HuggingFace and ModelScope.

Qwen notes that the current model may not always perform optimally on tasks involving long sequences in real-world applications.

There are still many unsolved challenges, including the model's weaker performance with long sequences and the high inference costs that hinder the deployment of larger models.

In the future, Qwen aims to further explore human preference alignment for long sequences, optimize inference efficiency to reduce computation time, and bring larger, more capable models with long context to market.

Why bother with large context windows?

The context windows of large language models have grown steadily in recent months. A practical standard has settled between 128,000 (GPT-4o) and 200,000 (Claude 3.5 Sonnet) tokens, although there are outliers such as Gemini 1.5 Pro, which has been tested with up to 10 million tokens, and Magic AI's LTM-2-mini with 100 million tokens.

While these advancements generally contribute to the usefulness of large language models, studies have consistently questioned the benefit of large context windows in comparison to RAG systems, where additional information is retrieved dynamically from vector databases.
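
To make the contrast concrete, here is a toy sketch of the retrieval step in a RAG system. It is not a real vector database: the bag-of-words "embedding" stands in for a learned embedding model, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are assumptions for this example. The point is that only the best-matching chunks reach the prompt, instead of the whole corpus.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (a stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 1) -> List[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, chunks: List[str], k: int = 1) -> str:
    """Assemble a short prompt from the retrieved chunks only."""
    context = "\n".join(retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

A long-context model skips the retrieval step and reads everything, which avoids retrieval misses but pays for every token on every request.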
