Autoregressive Language Models Are Not Doomed

2025/02/12 08:15

Yann LeCun, an AI scientist at Meta and one of the pioneers of modern AI, recently argued that autoregressive large language models (LLMs) are fundamentally flawed.

Yann LeCun, Chief AI Scientist at Meta and one of the pioneers of modern AI, recently shared his thoughts on a fundamental limitation he sees in autoregressive Large Language Models (LLMs). According to LeCun, the probability of generating a correct response decreases exponentially with each token, making them impractical for long-form, reliable AI interactions.

While I deeply respect LeCun’s work and approach to AI development and resonate with many of his insights, I believe this particular claim overlooks some key aspects of how LLMs function in practice. In this post, I’ll explain why autoregressive models are not inherently divergent and doomed, and how techniques like Chain-of-Thought (CoT) and Attentive Reasoning Queries (ARQs)—a method we’ve developed to achieve high-accuracy customer interactions with Parlant—effectively prove otherwise.

What is Autoregression?

At its core, an LLM is a probabilistic model trained to generate text one token at a time. Given an input context, the model predicts the most likely next token, feeds it back into the original sequence, and repeats the process iteratively until a stop condition is met. This allows the model to generate anything from short responses to entire articles.
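
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how a production LLM is implemented: the `BIGRAMS` table, the `<s>` and `</s>` markers, and the function names are all hypothetical, and the toy "model" looks only at the last token, whereas a real LLM conditions on the whole context. It also samples from the distribution rather than always taking the single most likely token.

```python
import random

# Hypothetical toy "model": next-token probabilities given the last token only.
# A real LLM conditions on the entire context; this table is purely illustrative.
BIGRAMS = {
    "<s>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 1.0},
    "dog": {"sat": 1.0},
    "sat": {"</s>": 1.0},
}


def next_token(context, rng):
    """Sample the next token from the toy distribution for the last token."""
    dist = BIGRAMS[context[-1]]
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(max_tokens=10, seed=0):
    """Autoregressive loop: predict, append, repeat until a stop condition."""
    rng = random.Random(seed)
    context = ["<s>"]
    for _ in range(max_tokens):
        token = next_token(context, rng)
        if token == "</s>":  # stop condition
            break
        context.append(token)  # feed the prediction back into the sequence
    return context[1:]  # drop the start marker


print(" ".join(generate()))
```

Each iteration sees everything generated so far, which is the property the rest of this post relies on.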

For a deeper dive into autoregression, check out our recent technical blog post.

Do Generation Errors Compound Exponentially?

LeCun’s argument can be unpacked as follows:

Let E be the probability of making a generation error at each token.

For an output of length n, the probability of maintaining coherence is (1-E)^n.

This leads to LeCun’s conclusion that for sufficiently long responses, the likelihood of maintaining coherence exponentially approaches zero, suggesting that autoregressive LLMs are inherently flawed.
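
To see how quickly this expression shrinks, here is a small worked example. The per-token error rate of 0.01 is an assumed value chosen only for illustration, and the function name is hypothetical; neither comes from LeCun's argument.

```python
def coherence_probability(error_rate, length):
    """Probability of zero errors in `length` tokens, assuming a constant,
    independent per-token error rate: (1 - E)^n."""
    return (1.0 - error_rate) ** length


# Assumed illustrative error rate of 1% per token.
for n in (10, 100, 1000):
    print(n, coherence_probability(0.01, n))
```

Under these assumptions the probability is about 90% at 10 tokens, about 37% at 100 tokens, and well under 0.01% at 1,000 tokens. The decay is real, but only if the assumptions behind the formula hold.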

But here’s the problem: E is not constant.

To put it simply, LeCun’s argument assumes that the probability of making a mistake at each new token is constant and independent of the tokens that came before it. However, LLMs don’t work that way.

As an analogy to what allows LLMs to overcome this problem, imagine you’re telling a story: if you make a mistake in one sentence, you can still correct it in the next one to keep the narrative coherent. The same applies to LLMs, especially when techniques like Chain-of-Thought (CoT) prompting guide them toward better reasoning by helping them reassess their own outputs along the way.

Why This Assumption is Flawed

LLMs exhibit self-correction properties that prevent them from spiraling into incoherence.

Take Chain-of-Thought (CoT) prompting, which encourages the model to generate intermediate reasoning steps. CoT allows the model to consider multiple perspectives, improving its ability to converge to an acceptable answer. Similarly, Chain-of-Verification (CoV) and structured feedback mechanisms like ARQs guide the model in reinforcing valid outputs and discarding erroneous ones.

A small mistake early on in the generation process doesn’t necessarily doom the final answer. Figuratively speaking, an LLM can double-check its work, backtrack, and correct errors on the go.
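
One way to make this concrete is a toy two-state model, sketched below. It is an assumption-laden illustration rather than a description of how any real LLM behaves: the generation is either "on track" or "off track", it slips off track with probability `error_rate` per token, and it recovers with probability `recovery_rate` per token. Both rates and the function name are hypothetical.

```python
def on_track_probability(error_rate, recovery_rate, length):
    """Probability of being on track after `length` tokens in a two-state toy model.

    on track  -> off track with probability error_rate
    off track -> on track  with probability recovery_rate
    """
    p = 1.0  # generation starts on track
    for _ in range(length):
        p = p * (1.0 - error_rate) + (1.0 - p) * recovery_rate
    return p


# Assumed illustrative rates: 1% error per token, 20% recovery per token.
for n in (10, 100, 1000):
    no_recovery = on_track_probability(0.01, 0.0, n)  # reduces to (1 - E)^n
    with_recovery = on_track_probability(0.01, 0.2, n)
    print(n, no_recovery, with_recovery)
```

With no recovery, the model reproduces the (1-E)^n decay. With any nonzero recovery rate, the probability of being on track stops decaying and settles near recovery_rate / (error_rate + recovery_rate), which is about 0.95 for the values assumed here. Note that this measures whether the output ends up on track, not whether it never erred along the way, which is the distinction the argument above turns on.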

Attentive Reasoning Queries (ARQs) Are a Game-Changer

At Parlant, we’ve taken this principle further in our work on Attentive Reasoning Queries. (A research paper describing our results is in the works, but the implementation pattern can be explored in our open-source codebase.) ARQs introduce reasoning blueprints that help the model maintain coherence throughout long completions: they dynamically refocus attention on key instructions at strategic points in the completion process, which keeps the LLM from diverging into incoherence. Using them, we’ve been able to maintain a large test suite that exhibits close to 100% consistency in generating correct completions for complex tasks.
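
As a rough illustration of the general idea of a reasoning blueprint, the sketch below asks the model to restate which instructions apply before it writes its reply. This is a hypothetical example and not Parlant's implementation or API: the field names, both function names, and the prompt wording are assumptions made for illustration, and the actual pattern should be taken from the open-source codebase.

```python
import json

# Hypothetical blueprint fields; these names are not taken from Parlant.
REQUIRED_FIELDS = ("applicable_instructions", "facts_needed", "draft_check", "response")


def build_arq_prompt(instructions, user_message):
    """Build a prompt whose required JSON output revisits key instructions first."""
    blueprint = {
        "applicable_instructions": "<which of the instructions apply to this message?>",
        "facts_needed": "<what information does the answer depend on?>",
        "draft_check": "<does the planned answer violate any instruction?>",
        "response": "<final reply to the customer>",
    }
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(instructions, 1))
    return (
        f"Instructions:\n{numbered}\n\n"
        f"Customer message:\n{user_message}\n\n"
        "Reply with a JSON object that fills in every field, in this order:\n"
        f"{json.dumps(blueprint, indent=2)}"
    )


def parse_arq_reply(raw_reply):
    """Parse the model's JSON reply; reject it if any blueprint field is missing."""
    reply = json.loads(raw_reply)
    missing = [key for key in REQUIRED_FIELDS if key not in reply]
    if missing:
        raise ValueError(f"reply is missing fields: {missing}")
    return reply["response"]


prompt = build_arq_prompt(
    ["Never promise refunds.", "Offer to escalate billing issues."],
    "I was charged twice this month.",
)
print(prompt)
```

Because the reasoning fields come before `response` in the requested object, an autoregressive model generates them first, so the final reply is conditioned on a fresh restatement of the instructions rather than only on their original position in a long context.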

This technique allows us to achieve much higher accuracy in AI-driven reasoning and instruction-following, which has been critical for us in enabling reliable and aligned customer-facing applications.

Autoregressive Models Are Here to Stay

We think autoregressive LLMs are far from doomed. While long-form coherence is a challenge, assuming an exponentially compounding error rate ignores key mechanisms that mitigate divergence—from Chain-of-Thought reasoning to structured reasoning like ARQs.

If you’re interested in AI alignment and increasing the accuracy of chat agents using LLMs, feel free to explore Parlant’s open-source effort. Let’s continue refining how LLMs generate and structure knowledge.

Original source: marktechpost
