Root-aligned SMILES: A Tight Representation for Chemical Reaction Prediction
Zipeng Zhong, Jie Song, Zunlei Feng, Tiantao Liu, Lingxiang Jia, Shaolun Yao, Min Wu, Tingjun Hou, Mingli Song

TL;DR
This paper introduces root-aligned SMILES (R-SMILES), a new molecule representation that improves chemical reaction prediction by aligning reactants and products more closely, leading to better model performance.
Contribution
The paper proposes R-SMILES, a tightly aligned SMILES representation that enhances reaction prediction accuracy by simplifying the learning task for models.
Findings
R-SMILES significantly outperforms state-of-the-art baselines.
The strict one-to-one mapping and reduced edit distance between product and reactant SMILES lighten what the model has to learn.
R-SMILES improves prediction efficiency and accuracy.
Abstract
Chemical reaction prediction, involving forward synthesis and retrosynthesis prediction, is a fundamental problem in organic synthesis. A popular computational paradigm formulates synthesis prediction as a sequence-to-sequence translation problem, where the typical SMILES is adopted for molecule representations. However, the general-purpose SMILES neglects the characteristics of chemical reactions, where the molecular graph topology is largely unaltered from reactants to products, resulting in the suboptimal performance of SMILES if straightforwardly applied. In this article, we propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES for more efficient synthesis prediction. Due to the strict one-to-one mapping and reduced edit distance, the computational model is largely relieved from learning the…
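To make the "reduced edit distance" idea concrete, here is a minimal sketch that compares a product SMILES against two equivalent ways of writing the same reactants. Everything in it is an illustrative assumption rather than the authors' code or data: the esterification example, the hand-written SMILES strings, and the helper name `edit_distance` are all hypothetical, and the distance is computed per character rather than with the tokenization a real model would use.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical example: ethyl acetate (product) from acetic acid + ethanol.
product = "CC(=O)OCC"

# Reactants written starting from the same atoms, in the same order, as the product.
reactants_aligned = "CC(=O)O.OCC"

# The same two molecules written from different starting atoms.
reactants_unaligned = "OC(C)=O.CCO"

d_aligned = edit_distance(product, reactants_aligned)
d_unaligned = edit_distance(product, reactants_unaligned)
print(d_aligned, d_unaligned)
```

In this toy case the aligned reactant string differs from the product only by the inserted `.O`, so a sequence-to-sequence model would mostly copy its input, whereas the unaligned string describes the same molecules but shares far less of its character sequence with the product.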
