ChemiRise: a data-driven retrosynthesis engine
Xiangyan Sun, Ke Liu, Yuquan Lin, Lingjie Wu, Haoming Xing, Minghong Gao, Ji Liu, Suocheng Tan, Zekun Ni, Qi Han, Junqiu Wu, Jie Fan

TL;DR
ChemiRise is a data-driven retrosynthesis system that rapidly proposes complete synthesis routes for organic compounds. It combines a processed patent database of over 3 million reactions, a graph convolutional neural network, and a guiding algorithm that searches a DAG of compounds; the proposed routes were reviewed and rated by human experts.
Contribution
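The paper introduces ChemiRise, an end-to-end retrosynthesis engine that combines a large reaction database, a graph convolutional neural network-based one-step reaction proposer, and a DAG-guided search algorithm. Its atom-mapping algorithm and one-step proposer were benchmarked against previous studies and showed better results.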
Findings
Atom-mapping algorithm outperforms previous methods
Graph neural network-based reaction proposer is more accurate than previous methods in benchmarks
Retrosynthesis routes are validated by human experts
Abstract
We have developed an end-to-end retrosynthesis system, named ChemiRise, that can propose complete retrosynthesis routes for organic compounds rapidly and reliably. The system was trained on a processed patent database of over 3 million organic reactions. Experimental reactions were atom-mapped, clustered, and extracted into reaction templates. We then trained a graph convolutional neural network-based one-step reaction proposer using template embeddings and developed a guiding algorithm on the directed acyclic graph (DAG) of chemical compounds to find the best candidate to explore. The atom-mapping algorithm and the one-step reaction proposer were benchmarked against previous studies and showed better results. The final product was demonstrated by retrosynthesis routes reviewed and rated by human experts, showing satisfying functionality and a potential productivity boost in real-life…
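To make the idea of "finding the best candidate to explore" concrete, here is a minimal sketch of a generic best-first retrosynthesis search. It is not ChemiRise's guiding algorithm, which the abstract does not specify. Everything in it is an assumption made for illustration: the toy molecule labels, the `TOY_REACTIONS` table standing in for a trained one-step proposer, the `BUILDING_BLOCKS` set of purchasable compounds, and the choice to score a route by the product of its step scores.

```python
import heapq
from itertools import count

# Hypothetical stand-in for a trained one-step proposer:
# molecule -> list of (score, precursors). Labels are toy names, not real chemistry.
TOY_REACTIONS = {
    "D": [(0.9, ("B", "C")), (0.4, ("E",))],
    "B": [(0.8, ("A",))],
    "E": [(0.5, ("X",))],
}

# Hypothetical set of purchasable starting materials.
BUILDING_BLOCKS = {"A", "C"}


def propose(molecule):
    """Return candidate one-step disconnections for a molecule (toy lookup)."""
    return TOY_REACTIONS.get(molecule, [])


def find_route(target, max_expansions=100):
    """Best-first search over sets of unsolved molecules.

    A state is the set of molecules that still need a synthesis step.
    The state with the highest product of step scores is expanded first.
    Returns (score, route) or None if no route is found.
    """
    tie = count()  # tie-breaker so the heap never compares sets or routes
    start = frozenset([target]) - BUILDING_BLOCKS
    frontier = [(-1.0, next(tie), start, ())]
    seen = set()
    expansions = 0

    while frontier and expansions < max_expansions:
        neg_score, _, open_mols, route = heapq.heappop(frontier)
        if not open_mols:
            # Every leaf is a building block: the route is complete.
            return -neg_score, list(route)
        if open_mols in seen:
            continue
        seen.add(open_mols)
        expansions += 1

        mol = min(open_mols)  # deterministic choice of the next molecule to expand
        for score, precursors in propose(mol):
            new_open = (open_mols - {mol}) | (frozenset(precursors) - BUILDING_BLOCKS)
            new_route = route + ((mol, precursors),)
            heapq.heappush(frontier, (neg_score * score, next(tie), new_open, new_route))

    return None


if __name__ == "__main__":
    print(find_route("D"))
```

In this toy setup the search prefers the disconnection of `D` into `B` and `C` over the lower-scoring path through `E`, then resolves `B` to the building block `A`. A real system would replace `propose` with a learned model and would likely use a richer guiding score than a plain product of step scores.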
Taxonomy
Topics: Machine Learning in Materials Science · Innovative Microfluidic and Catalytic Techniques Innovation · Computational Drug Discovery Methods
