Mind the Retrosynthesis Gap: Bridging the divide between Single-step and Multi-step Retrosynthesis Prediction
Alan Kai Hassen, Paula Torren-Peraire, Samuel Genheden, Jonas Verhoeven, Mike Preuss, Igor Tetko

TL;DR
This paper investigates how well single-step retrosynthesis models transfer to multi-step synthesis planning. It reports route-finding improvements of up to 30% and argues that single-step and multi-step performance need to be evaluated together.
Contribution
It establishes a bridge between single-step and multi-step retrosynthesis by benchmarking how well single-step models transfer to multi-step search, and it emphasizes the importance of multi-step evaluation metrics.
Findings
Single-step models can improve multi-step route finding by up to 30%.
No clear correlation exists between single-step and multi-step evaluation metrics.
Developing models specifically for multi-step retrosynthesis is necessary.
Abstract
Retrosynthesis is the task of breaking down a chemical compound recursively step-by-step into molecular precursors until a set of commercially available molecules is found. Consequently, the goal is to provide a valid synthesis route for a molecule. As more single-step models develop, we see increasing accuracy in the prediction of molecular disconnections, potentially improving the creation of synthetic paths. Multi-step approaches repeatedly apply the chemical information stored in single-step retrosynthesis models. However, this connection is not reflected in contemporary research, fixing either the single-step model or the multi-step algorithm in the process. In this work, we establish a bridge between both tasks by benchmarking the performance and transfer of different single-step retrosynthesis models to the multi-step domain by leveraging two common search algorithms, Monte Carlo…
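The recursive decomposition described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it uses a plain depth-first search rather than the search algorithms the paper benchmarks, and the single-step "model" is a hypothetical lookup table. All names here (TOY_RULES, STOCK, make_toy_model, find_route) and the single-letter molecule labels are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical stand-in for a single-step model: given a product, return
# candidate precursor sets, best candidate first.
SingleStepModel = Callable[[str], List[List[str]]]
Route = List[Tuple[str, List[str]]]

# Toy data (illustrative labels, not real chemistry).
TOY_RULES: Dict[str, List[List[str]]] = {
    "T": [["X", "Y"], ["A", "B"]],  # two candidate disconnections for T
    "A": [["C"]],
}
STOCK: Set[str] = {"B", "C"}  # "commercially available" molecules


def make_toy_model(rules: Dict[str, List[List[str]]]) -> SingleStepModel:
    """Wrap a lookup table so it behaves like a single-step predictor."""
    def predict(molecule: str) -> List[List[str]]:
        return rules.get(molecule, [])
    return predict


def find_route(target: str, model: SingleStepModel, stock: Set[str],
               max_depth: int = 5) -> Optional[Route]:
    """Depth-first search: apply the single-step model recursively until
    every leaf is in stock. Returns the reaction steps, or None."""
    if target in stock:
        return []  # nothing to synthesize
    if max_depth == 0:
        return None  # depth budget exhausted
    for precursors in model(target):
        route: Route = [(target, precursors)]
        solved = True
        for precursor in precursors:
            sub_route = find_route(precursor, model, stock, max_depth - 1)
            if sub_route is None:
                solved = False  # this disconnection is a dead end
                break
            route.extend(sub_route)
        if solved:
            return route
    return None


if __name__ == "__main__":
    print(find_route("T", make_toy_model(TOY_RULES), STOCK))
```

In this toy setting the first candidate for T fails because X is neither in stock nor covered by a rule, so the search falls back to the second candidate. Swapping in a different single-step predictor changes which routes are found without touching the search, which is the coupling the paper sets out to measure.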
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Chemical Synthesis and Analysis
