Meta AI Introduces Large Concept Models (LCMs): A New Architecture That Advances Language Modeling

2024/12/16 08:44

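Large language models (LLMs) have made notable progress in natural language processing (NLP), enabling applications in text generation, summarization, and question answering. However, their reliance on token-level processing, predicting one token at a time, poses challenges. This approach contrasts with human communication, which typically operates at higher levels of abstraction, such as sentences or ideas.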

Large Language Models (LLMs) have made impressive strides in natural language processing (NLP), with applications ranging from text generation to summarization and question answering. However, these models typically rely on token-level processing, predicting one token at a time, which presents several challenges.

In contrast, human communication operates at higher levels of abstraction, such as sentences or ideas. Token-level modeling also encounters difficulties in tasks requiring long-context understanding and may produce outputs with inconsistencies. Furthermore, extending these models to multilingual and multimodal applications is computationally expensive and data-intensive.
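
To make the difference in granularity concrete, the following minimal sketch counts how many prediction steps a short passage would need at each level. It is an illustration only: whitespace-separated words stand in for tokens (real LLMs use subword tokenizers), the punctuation-based split stands in for sentence segmentation, and `count_units` and `sample` are hypothetical names that do not come from the LCM work.

```python
import re


def count_units(text: str) -> tuple[int, int]:
    """Return (word_count, sentence_count) for a text.

    Words approximate tokens and a punctuation-based split approximates
    sentence segmentation; both are simplifications for illustration.
    """
    words = text.split()
    # Split after ., ! or ? when followed by whitespace; drop empty pieces.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return len(words), len(sentences)


sample = (
    "Large language models predict one token at a time. "
    "People tend to plan in sentences or ideas. "
    "A concept-level model would take one step per sentence."
)

n_words, n_sentences = count_units(sample)
print(f"token-level steps (approx.): {n_words}")
print(f"sentence-level steps: {n_sentences}")
```

For long documents the gap between the two counts grows, which is one reason a sentence-level sequence can be easier to handle in long-context tasks.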

To overcome these limitations, a team of researchers at Meta AI has proposed a new approach called Large Concept Models (LCMs).

Meta AI’s Large Concept Models (LCMs) mark a departure from traditional LLM architectures. LCMs introduce two key innovations:

At the heart of LCMs are concept encoders and decoders that map input sentences into SONAR’s embedding space and decode embeddings back into natural language or other modalities. These components are frozen, enabling modularity and ease of extension to new languages or modalities without retraining the entire model.
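
The encode, predict, decode flow described above can be sketched roughly as follows. This is a toy under stated assumptions, not SONAR's or Meta AI's actual API: `ToyCodec`, `next_concept`, and `generate` are hypothetical names, the "embeddings" are one-hot vectors rather than dense learned ones, and the "concept model" is a fixed rule rather than a trained network.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Vector = tuple[float, ...]


@dataclass(frozen=True)
class ToyCodec:
    """Hypothetical stand-in for a frozen sentence encoder/decoder pair."""

    vocabulary: tuple[str, ...]  # the only sentences this toy codec knows

    def encode(self, sentence: str) -> Vector:
        # One-hot "embedding" marking the sentence's position in the vocabulary.
        return tuple(1.0 if s == sentence else 0.0 for s in self.vocabulary)

    def decode(self, embedding: Vector) -> str:
        # Pick the vocabulary entry at the largest component.
        best = max(range(len(embedding)), key=embedding.__getitem__)
        return self.vocabulary[best]


def next_concept(context: Sequence[Vector]) -> Vector:
    """Placeholder concept model: rotate the last one-hot vector by one slot."""
    last = context[-1]
    return last[-1:] + last[:-1]


def generate(
    codec: ToyCodec,
    model: Callable[[Sequence[Vector]], Vector],
    prompt: Sequence[str],
    steps: int,
) -> list[str]:
    """Encode the prompt, predict embeddings one sentence at a time, decode each."""
    context = [codec.encode(s) for s in prompt]
    output = []
    for _ in range(steps):
        embedding = model(context)  # operates purely in embedding space
        context.append(embedding)
        output.append(codec.decode(embedding))
    return output


codec = ToyCodec(("The sky is dark.", "Rain begins to fall.", "People open umbrellas."))
print(generate(codec, next_concept, ["The sky is dark."], steps=2))
```

Because `next_concept` only ever sees vectors, swapping in a codec for another language or modality would leave the model untouched, which is the modularity the frozen components are meant to provide.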

Technical Details and Benefits of LCMs

LCMs incorporate several innovations to enhance language modeling:

Insights from Experimental Results

Meta AI’s experiments showcase the capabilities of LCMs. A diffusion-based Two-Tower LCM scaled to 7 billion parameters achieved competitive performance in tasks like summarization. Key results include:

Conclusion

Meta AI’s Large Concept Models present a promising alternative to traditional token-based language models. By utilizing high-dimensional concept embeddings and modality-agnostic processing, LCMs address fundamental limitations of existing approaches. Their hierarchical architecture improves coherence and efficiency, while their strong zero-shot generalization expands their applicability to diverse languages and modalities. As research into this architecture continues, LCMs have the potential to redefine the capabilities of language models, offering a more scalable and adaptable approach to AI-driven communication.

News source: www.marktechpost.com
