
G2PT: Graph Generative Pre-trained Transformer

2025/01/06 04:21

Researchers from Tufts University, Northeastern University, and Cornell University have developed the Graph Generative Pre-trained Transformer (G2PT), an auto-regressive model designed to learn graph structures through next-token prediction.


Graph generation is a critical task in diverse fields like molecular design and social network analysis, owing to its capacity to model intricate relationships and structured data. Despite recent advances, many graph generative models heavily rely on adjacency matrix representations. While effective, these methods can be computationally demanding and often lack flexibility, making it challenging to efficiently capture the complex dependencies between nodes and edges, especially for large and sparse graphs. Current approaches, including diffusion-based and auto-regressive models, encounter difficulties in terms of scalability and accuracy, highlighting the need for more refined solutions.


In a recent study, a team of researchers from Tufts University, Northeastern University, and Cornell University introduces the Graph Generative Pre-trained Transformer (G2PT), an auto-regressive model designed to learn graph structures through next-token prediction. Unlike traditional methods, G2PT employs a sequence-based representation of graphs, encoding nodes and edges as sequences of tokens. This approach streamlines the modeling process, making it more efficient and scalable. By leveraging a transformer decoder for token prediction, G2PT generates graphs that maintain structural integrity and flexibility. Moreover, G2PT can be readily adapted to downstream tasks, such as goal-oriented graph generation and graph property prediction, serving as a versatile tool for various applications.
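The next-token-prediction view of graph generation can be sketched as follows. The toy token names below are illustrative, not G2PT's actual vocabulary: a graph is flattened into a sequence, and the model is trained to predict each token from the prefix before it.

```python
# Toy serialized graph: node definitions followed by edge definitions.
# Token names here are illustrative, not G2PT's actual vocabulary.
tokens = ["<bos>", "n0:C", "n1:O", "n2:C",   # node index:type
          "e(0,1):single", "e(1,2):double",  # edge (src,dst):label
          "<eos>"]

# Auto-regressive training pairs: predict each token from its prefix.
pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, target in pairs[:2]:
    print(context, "->", target)
```

A transformer decoder is trained on exactly these (prefix, next-token) pairs, so generating a graph reduces to sampling tokens left to right until `<eos>`.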


Technical Insights and Benefits


G2PT introduces a novel sequence-based representation that decomposes graphs into node and edge definitions. Node definitions specify indices and types, whereas edge definitions outline connections and labels. This approach fundamentally differs from adjacency matrix representations, which account for all possible edges: by encoding only the edges that exist, it reduces sparsity and computational complexity. The transformer decoder effectively models these sequences through next-token prediction, yielding gains in efficiency and scalability over matrix-based approaches.
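A minimal sketch of that contrast (the graph and token format are illustrative assumptions, not the paper's exact encoding): for a sparse graph, the sequence grows with the number of nodes and edges, while a dense adjacency matrix grows with the square of the node count.

```python
# Illustrative sparse graph: 6 nodes on a path, 5 labeled edges.
nodes = [(i, "C") for i in range(6)]              # (index, node type)
edges = [(i, i + 1, "single") for i in range(5)]  # (src, dst, edge label)

# Sequence-based representation: only existing nodes and edges are encoded.
sequence = [f"n{i}:{t}" for i, t in nodes] + \
           [f"e({u},{v}):{lab}" for u, v, lab in edges]

adjacency_entries = len(nodes) ** 2   # dense matrix stores all n^2 entries
sequence_tokens = len(sequence)       # sequence stores n + m tokens

print(sequence_tokens, "tokens vs", adjacency_entries, "matrix entries")
```

For large, sparse graphs (m much smaller than n^2) this gap widens quickly, which is the efficiency argument behind the sequence representation.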


The researchers also explored fine-tuning methods for tasks like goal-oriented generation and graph property prediction, broadening the model’s applicability.


Experimental Results and Insights


G2PT has been evaluated on various datasets and tasks, demonstrating strong performance. In general graph generation, it matched or exceeded the state-of-the-art performance across seven datasets. In molecular graph generation, G2PT achieved high validity and uniqueness scores, reflecting its ability to accurately capture structural details. For instance, on the MOSES dataset, G2PTbase attained a validity score of 96.4% and a uniqueness score of 100%.
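The two molecular-generation metrics are straightforward to compute. The sketch below uses a hypothetical `is_valid` placeholder; a real pipeline would use a chemistry toolkit to parse and sanitize each molecule.

```python
def is_valid(smiles: str) -> bool:
    # Hypothetical placeholder check: a real pipeline would parse the
    # SMILES string with a chemistry toolkit to test validity.
    return smiles != ""

def validity(samples):
    """Fraction of generated molecules that are valid."""
    return sum(is_valid(s) for s in samples) / len(samples)

def uniqueness(samples):
    """Fraction of valid generated molecules that are distinct."""
    valid = [s for s in samples if is_valid(s)]
    return len(set(valid)) / len(valid)

samples = ["CCO", "CCO", "c1ccccc1", ""]
print(f"validity={validity(samples):.2f}, uniqueness={uniqueness(samples):.2f}")
```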


In goal-oriented generation, G2PT aligned generated graphs with desired properties using fine-tuning techniques such as rejection sampling and reinforcement learning, which let the model adapt its outputs effectively. Similarly, in predictive tasks, G2PT's embeddings delivered competitive results across molecular property benchmarks, reinforcing its suitability for both generative and predictive tasks.
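The rejection-sampling idea can be sketched as below. The `sample_graphs` and `score` names are placeholders, not the paper's API: generate candidates from the model, keep only those meeting the property target, and use the survivors as fine-tuning data.

```python
import random

random.seed(0)

def sample_graphs(model, n):
    # Placeholder: stand-in for auto-regressive sampling from the model.
    return [{"id": i, "property": random.random()} for i in range(n)]

def score(graph):
    # Placeholder property oracle (e.g. a desired molecular property).
    return graph["property"]

def rejection_sample(model, n, threshold):
    """Keep only generated graphs whose property meets the target."""
    return [g for g in sample_graphs(model, n) if score(g) >= threshold]

accepted = rejection_sample(model=None, n=1000, threshold=0.9)
# The accepted set would then serve as fine-tuning data for the model.
print(len(accepted), "of 1000 candidates accepted")
```

Iterating this loop shifts the model's output distribution toward the desired property region, which is the effect described above.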


Conclusion


The Graph Generative Pre-trained Transformer (G2PT) represents a thoughtful step forward in graph generation. By employing a sequence-based representation and transformer-based modeling, G2PT addresses many limitations of traditional approaches. Its combination of efficiency, scalability, and adaptability makes it a valuable resource for researchers and practitioners. While G2PT shows sensitivity to graph orderings, further exploration of universal and expressive edge-ordering mechanisms could enhance its robustness. G2PT exemplifies how innovative representations and modeling approaches can advance the field of graph generation.



News source: www.marktechpost.com

