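EditRetro: Reframing Retrosynthesis Prediction as a Molecular String Editing Task

2024/07/31 03:06

Designing synthetic reaction pathways for molecules is a fundamental aspect of organic synthesis, with significant implications for fields such as the biomedical, pharmaceutical, and materials industries. Retrosynthetic analysis is the most widely used approach for developing synthetic routes.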

Organic synthesis plays a pivotal role in various fields, including the biomedical, pharmaceutical, and materials industries. Retrosynthetic analysis is the primary approach for designing synthetic routes: it decomposes a molecule into simpler precursors using established reactions. This methodology, first formalized by Corey, led to the development of computer-aided synthesis planning (CASP). In recent years, artificial intelligence (AI)-driven retrosynthesis has facilitated the exploration of more complex molecules and significantly reduced the time and energy required to design synthetic experiments. Single-step retrosynthesis prediction is a crucial component of retrosynthetic planning, and several deep learning-based methods have been proposed with promising results. These methods fall broadly into three groups: template-based, template-free, and semi-template-based.

Template-based methods treat retrosynthesis prediction as a template retrieval problem and compare the target molecule with precomputed templates. These templates capture the essential features of the reaction center in specific types of chemical reactions. They can be generated manually or automatically, and they guide the model toward the most suitable chemical transformation for a given molecule. Various works have proposed different ways to prioritize candidate templates. RetroSim ranked the candidate templates by the molecular fingerprint similarity between the given product and the molecules in the corpus. NeuralSym was the first work to use deep neural networks for template selection, learning a multi-class classifier. GLN built a conditional graph logic network to learn the conditional joint probability of templates and reactants. LocalRetro evaluated the suitability of local atom/bond templates at all predicted reaction centers of a target molecule and incorporated non-local effects in chemical reactions through global reactivity attention; it has demonstrated state-of-the-art performance among template-based methods. Although template-based models provide interpretability and molecule validity, they suffer from limited generalization and scalability issues, which can hinder their practical utility.
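The similarity-based ranking idea described for RetroSim can be illustrated with a minimal sketch. This is not RetroSim's actual implementation: the fingerprints here are hypothetical sets of "on" bit indices rather than real molecular fingerprints, the template names and the helper names `tanimoto` and `rank_templates` are made up for illustration, and only product similarity is scored.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # hypothetical: indices of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_templates(
    target_fp: Fingerprint,
    corpus: List[Tuple[Fingerprint, str]],
) -> List[Tuple[str, float]]:
    """Rank templates by the best similarity between the target product
    and any known product in the corpus that used that template."""
    best: Dict[str, float] = {}
    for product_fp, template in corpus:
        score = tanimoto(target_fp, product_fp)
        if score > best.get(template, -1.0):
            best[template] = score
    # Highest-scoring template first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy corpus: (fingerprint of a known product, template that produced it).
corpus = [
    (frozenset({1, 2, 3, 4}), "amide_coupling"),
    (frozenset({1, 2, 9}), "ester_hydrolysis"),
    (frozenset({7, 8}), "suzuki_coupling"),
]
target = frozenset({1, 2, 3, 5})
ranked = rank_templates(target, corpus)
```

In this toy setup the template attached to the most similar known product ranks first. A real system would compute fingerprints from molecular structures with a cheminformatics toolkit and then apply the top-ranked templates to the target to obtain candidate reactants.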

Template-free methods use deep generative models to generate reactant molecules without relying on predefined templates. Most existing methods reformulate the task as a sequence-to-sequence problem, employing the sequence representation of molecules, specifically the simplified molecular-input line-entry system (SMILES). Liu et al. first used a long short-term memory (LSTM)-based sequence-to-sequence (Seq2Seq) model to convert the SMILES representation of a product into the SMILES of the reactants. Karpov et al. further proposed a Transformer-based Seq2Seq method for retrosynthesis. SCROP integrated a grammar corrector into the Transformer architecture to resolve the prevalent problem of grammatical invalidity in Seq2Seq methods. R-SMILES established a closely aligned one-to-one mapping between the SMILES representations of the products and the reactants to enhance the efficiency of synthesis prediction in Transformer-based methods. PMSR devised three tailored pre-training tasks for retrosynthesis (auto-regression, molecule recovery, and contrastive reaction classification), improving retrosynthesis performance and achieving state-of-the-art accuracy among template-free methods. Some studies characterize the task as a graph-to-sequence problem, taking the molecular graph as input. Graph2SMILES integrated a sequential graph encoder with a Transformer decoder to preserve the permutation invariance of SMILES. Retroformer introduced a local attention head in the Transformer encoder to augment its reasoning capability for reactions. Recent studies, including MEGAN, MARS, and Graph2Edits, have explored end-to-end molecular graph editing models that represent a chemical reaction as a series of graph edits, drawing inspiration from the arrow-pushing formalism. However, these approaches usually require time-consuming predictions for sequential graph edit operations. Fang et al. developed a substructure-level decoding method by automatically extracting commonly preserved portions of product molecules. However, the extraction of substructures is fully data-driven, and its coverage depends on the reaction dataset. Furthermore, incorrect substructures can lead to erroneous predictions. While template-free methods are fully data-driven, they raise concerns regarding the interpretability, chemical validity, and diversity of the generated molecules.
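To make the sequence-to-sequence framing concrete, the sketch below shows how a product and its reactants might be turned into token sequences before being fed to a Seq2Seq model. It is an illustration only: the regular expression is a simplified assumption that covers common SMILES tokens rather than the full grammar, the function name `tokenize_smiles` is hypothetical, and the acetanilide example is chosen here and does not come from any of the cited works.

```python
import re
from typing import List

# Simplified token pattern (an assumption, not a complete SMILES grammar):
# bracket atoms, two-letter halogens, two-digit ring closures, single-letter
# atoms, ring-closure digits, and bond/branch/separator symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOPSFIbcnops]|\d|[()=#\-+\\/.:~@*$]"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; fail loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize SMILES: {smiles!r}")
    return tokens


# Source sequence: the product (acetanilide).
product_tokens = tokenize_smiles("CC(=O)Nc1ccccc1")
# Target sequence: the reactants (acetyl chloride and aniline), joined by '.'.
reactant_tokens = tokenize_smiles("CC(=O)Cl.Nc1ccccc1")
```

A Seq2Seq model is then trained to map the product token sequence to the reactant token sequence. Note that `Cl` is kept as a single token, and that product and reactants share long runs of identical tokens, which is the kind of overlap an editing-based formulation can take advantage of.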

Semi-template-based methods combine the benefits of the two aforementioned approaches. They follow a two-stage procedure: first, the target molecule is fragmented into synthons by identifying reactive sites; then the synthons are converted into reactants using techniques such as leaving-group selection, graph generation, or SMILES generation. RetroXpert first identified the reaction center of the target molecule with an edge-enhanced graph attention network to obtain synthons, and then generated the corresponding reactants from the synthons. RetroPrime introduced mix-and-match and label-and-align strategies within a Transformer-based two-stage workflow to mitigate the challenges of insufficient diversity and chemical implausibility. G2Gs first partitioned the target molecular graph into several synthons by identifying potential reaction centers, and then translated the synthons into complete reactant graphs using a variational graph translation framework. GraphRetro first transformed the target into synthons by
