Meta AI Introduces Large Concept Models (LCMs), a New Architecture for Advancing Language Modeling

2024/12/16 08:44

Large Language Models (LLMs) have made impressive strides in natural language processing (NLP), with applications ranging from text generation to summarization and question-answering. However, these models typically rely on token-level processing, predicting one word at a time, which presents several challenges.

In contrast, human communication operates at higher levels of abstraction, such as sentences or ideas. Token-level modeling also encounters difficulties in tasks requiring long-context understanding and may produce outputs with inconsistencies. Furthermore, extending these models to multilingual and multimodal applications is computationally expensive and data-intensive.
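
To make this contrast concrete, the sketch below compares the two prediction targets: a conventional language model scores the next token over a vocabulary, while a concept-level model predicts the embedding of the next sentence. The module names, layer sizes, and dimensions are illustrative assumptions for this sketch, not Meta's implementation.

```python
import torch
import torch.nn as nn

# Token-level LM: scores the next token over a fixed vocabulary.
class TokenLM(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):            # (batch, n_tokens)
        h = self.backbone(self.embed(token_ids))
        return self.head(h[:, -1])           # logits for the next token

# Concept-level LM: predicts the embedding of the next sentence ("concept").
class ConceptLM(nn.Module):
    def __init__(self, d_concept=1024):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_concept, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_concept, d_concept)

    def forward(self, concepts):             # (batch, n_sentences, d_concept)
        h = self.backbone(concepts)
        return self.head(h[:, -1])           # predicted next-sentence embedding

tokens = torch.randint(0, 32000, (1, 12))    # 12 tokens of context
concepts = torch.randn(1, 3, 1024)           # 3 preceding sentence embeddings
print(TokenLM()(tokens).shape)               # torch.Size([1, 32000])
print(ConceptLM()(concepts).shape)           # torch.Size([1, 1024])
```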

To overcome these limitations, a team of researchers at Meta AI has proposed a new approach called Large Concept Models (LCMs).

Meta AI’s Large Concept Models (LCMs) mark a departure from traditional LLM architectures. LCMs introduce two key innovations: they model language in a high-dimensional concept embedding space, where each embedding represents a whole sentence or idea rather than a single token, and they process language in a language- and modality-agnostic way, so the same reasoning core can be reused across languages and modalities.

At the heart of LCMs are concept encoders and decoders that map input sentences into SONAR’s embedding space and decode embeddings back into natural language or other modalities. These components are frozen, enabling modularity and ease of extension to new languages or modalities without retraining the entire model.
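
A minimal sketch of this modular layout is shown below, assuming a frozen encoder and decoder as stand-ins for SONAR's pretrained components and a small trainable core that operates purely on concept embeddings; all names and dimensions here are hypothetical.

```python
import torch
import torch.nn as nn

D_CONCEPT = 1024  # dimensionality of the concept (sentence) embedding space

# Toy stand-in for SONAR's pretrained sentence encoder; kept frozen.
class FrozenConceptEncoder(nn.Module):
    def __init__(self, d_in=300):
        super().__init__()
        self.proj = nn.Linear(d_in, D_CONCEPT)

    @torch.no_grad()
    def forward(self, sentence_features):     # (batch, n_sentences, d_in)
        return self.proj(sentence_features)

# Toy stand-in for SONAR's pretrained decoder; also frozen.
class FrozenConceptDecoder(nn.Module):
    def __init__(self, d_out=300):
        super().__init__()
        self.proj = nn.Linear(D_CONCEPT, d_out)

    @torch.no_grad()
    def forward(self, concept):               # (batch, D_CONCEPT)
        return self.proj(concept)             # would be decoded back to text

# The only trainable part: predicts the next concept from previous concepts.
# (A real LCM uses a transformer over the concept sequence; a linear map over
# the last context concept keeps this sketch short.)
class LCMCore(nn.Module):
    def __init__(self):
        super().__init__()
        self.predict = nn.Linear(D_CONCEPT, D_CONCEPT)

    def forward(self, concepts):              # (batch, n_sentences, D_CONCEPT)
        return self.predict(concepts[:, -1])

encoder, decoder, core = FrozenConceptEncoder(), FrozenConceptDecoder(), LCMCore()
for p in list(encoder.parameters()) + list(decoder.parameters()):
    p.requires_grad_(False)                   # freeze the interface modules

sentence_feats = torch.randn(1, 3, 300)       # features for 3 context sentences
concepts = encoder(sentence_feats)            # map sentences into concept space
next_concept = core(concepts)                 # reason entirely in concept space
print(decoder(next_concept).shape)            # back toward text space: (1, 300)
```

Because the interface modules are frozen, swapping in an encoder or decoder for a new language or modality leaves the trainable core untouched, which is the modularity described above.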

Technical Details and Benefits of LCMs

LCMs incorporate several innovations to enhance language modeling: a hierarchical architecture that reasons over sentence-level concepts rather than tokens, diffusion-based generation in the concept embedding space, modular frozen encoders and decoders that make it straightforward to add new languages and modalities, and strong zero-shot generalization beyond the data seen in training.

Insights from Experimental Results

Meta AI’s experiments showcase the capabilities of LCMs. A diffusion-based Two-Tower LCM scaled to 7 billion parameters achieved competitive performance in tasks like summarization, along with strong zero-shot generalization to languages and tasks not covered during training.
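
As a rough illustration of what diffusion-based generation in concept space can look like, the sketch below refines a noisy vector toward a next-concept embedding over a few steps, conditioned on a context summary that stands in for the output of a context tower. The denoiser architecture, step count, and update rule are simplified assumptions and do not reproduce the Two-Tower LCM described in the paper.

```python
import torch
import torch.nn as nn

D = 1024     # concept embedding dimensionality
STEPS = 8    # number of refinement steps (illustrative)

# Predicts a cleaned-up next-concept embedding from (noisy guess, context, step).
class ConceptDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.step_embed = nn.Embedding(STEPS, D)
        self.net = nn.Sequential(nn.Linear(3 * D, 2 * D), nn.GELU(), nn.Linear(2 * D, D))

    def forward(self, noisy, context, step):
        t = self.step_embed(step).expand_as(noisy)
        return self.net(torch.cat([noisy, context, t], dim=-1))

@torch.no_grad()
def generate_next_concept(denoiser, context_summary):
    # Start from Gaussian noise and iteratively refine toward a concept embedding.
    # This is a simplified refinement loop, not the sampler used in the paper.
    x = torch.randn_like(context_summary)
    for s in reversed(range(STEPS)):
        step = torch.tensor([s])
        x_clean = denoiser(x, context_summary, step)   # model's current guess
        x = 0.5 * x + 0.5 * x_clean                    # move part-way toward the guess
    return x

context_summary = torch.randn(1, D)   # e.g. a pooled summary from a context tower
next_concept = generate_next_concept(ConceptDenoiser(), context_summary)
print(next_concept.shape)             # torch.Size([1, 1024])
```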

Conclusion

Meta AI’s Large Concept Models present a promising alternative to traditional token-based language models. By utilizing high-dimensional concept embeddings and modality-agnostic processing, LCMs address fundamental limitations of existing approaches. Their hierarchical architecture improves coherence and efficiency, while their strong zero-shot generalization expands their applicability to diverse languages and modalities. As research into this architecture continues, LCMs have the potential to redefine the capabilities of language models, offering a more scalable and adaptable approach to AI-driven communication.

Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.

News source: www.marktechpost.com

Disclaimer: info@kdj.com

The information provided is not trading advice. kdj.com does not assume any responsibility for any investments made based on the information provided in this article. Cryptocurrencies are highly volatile; it is strongly recommended that you invest with caution and only after thorough research!

If you believe that the content used on this website infringes your copyright, please contact us immediately (info@kdj.com) and we will delete it promptly.
