RETRO SYNFLOW: Discrete Flow Matching for Accurate and Diverse Single-Step Retrosynthesis
Robin Yadav, Qi Yan, Guy Wolf, Avishek Joey Bose, Renjie Liao

TL;DR
RETRO SYNFLOW introduces a novel discrete flow-matching framework for single-step retrosynthesis, significantly improving accuracy and diversity of predicted reactions by leveraging reaction center identification and steering techniques.
Contribution
It presents a new discrete flow-matching approach with reaction center-based intermediate structures and inference steering, advancing the state-of-the-art in single-step retrosynthesis prediction.
Findings
Achieves 60.0% top-1 accuracy, outperforming the previous state of the art by 20%.
FK-steering improves top-5 round-trip accuracy by 19% over prior template-free methods.
Maintains competitive top-k accuracy while enhancing diversity and feasibility.
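The top-k accuracy figures above can be read with a small sketch of how such a metric is commonly computed. This is an illustration under stated assumptions, not the paper's evaluation code: the function name `top_k_accuracy` and the toy SMILES strings are hypothetical, and predictions are compared by exact string match, which presumes both sides were canonicalized beforehand.

```python
from typing import Sequence


def top_k_accuracy(
    predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int,
) -> float:
    """Fraction of targets found among the first k ranked predictions.

    Assumption: every string is already a canonical reactant-set SMILES,
    so exact string equality is a meaningful comparison.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(predictions, targets) if truth in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical toy data: two products, two ranked reactant proposals each.
ranked_predictions = [
    ["CCO.CC(=O)O", "CCO.CC(=O)Cl"],
    ["Brc1ccccc1.OB(O)c1ccccc1", "Ic1ccccc1.OB(O)c1ccccc1"],
]
ground_truth = ["CCO.CC(=O)Cl", "Brc1ccccc1.OB(O)c1ccccc1"]

top1 = top_k_accuracy(ranked_predictions, ground_truth, k=1)
top2 = top_k_accuracy(ranked_predictions, ground_truth, k=2)
```

Round-trip accuracy differs from this exact-match metric: it typically asks whether a separate forward-reaction model maps the predicted reactants back to the product, so it needs a second model that this sketch does not include.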
Abstract
A fundamental problem in organic chemistry is identifying and predicting the series of reactions that synthesize a desired target product molecule. Due to the combinatorial nature of the chemical search space, single-step reactant prediction (i.e., single-step retrosynthesis) remains challenging: even existing state-of-the-art template-free generative approaches struggle to produce an accurate yet diverse set of feasible reactions. In this paper, we model single-step retrosynthesis planning and introduce RETRO SYNFLOW (RSF), a discrete flow-matching framework that builds a Markov bridge between the prescribed target product molecule and the reactant molecule. In contrast to past approaches, RSF employs a reaction center identification step to produce intermediate structures known as synthons as a more informative source distribution for the discrete flow. To further enhance diversity and…
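To make the idea of a discrete flow between a source and a target structure concrete, here is a minimal sketch of one common choice of probability path in discrete flow matching: a per-position mixture that keeps the source token with probability 1 - t and takes the target token otherwise. Everything here is an assumption for illustration. The abstract does not specify RSF's actual path, molecular graph representation, or network, and the integer tokens and the name `sample_bridge_state` are hypothetical.

```python
import random
from typing import List, Sequence


def sample_bridge_state(
    source: Sequence[int],
    target: Sequence[int],
    t: float,
    rng: random.Random,
) -> List[int]:
    """Sample an intermediate state x_t between source and target.

    Each position independently takes the target token with probability t
    and keeps the source token with probability 1 - t, so t = 0 returns
    the source and t = 1 returns the target.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    # rng.random() is in [0, 1), so the comparison is never true at t = 0
    # and always true at t = 1.
    return [tgt if rng.random() < t else src for src, tgt in zip(source, target)]


# Hypothetical integer labels standing in for atom types in a synthon
# (source) and the matching reactant (target).
rng = random.Random(0)
synthon_tokens = [6, 6, 8, 0, 0]
reactant_tokens = [6, 6, 8, 17, 1]

x_start = sample_bridge_state(synthon_tokens, reactant_tokens, 0.0, rng)
x_mid = sample_bridge_state(synthon_tokens, reactant_tokens, 0.5, rng)
x_end = sample_bridge_state(synthon_tokens, reactant_tokens, 1.0, rng)
```

In a training loop built on this kind of path, a model would typically be asked to predict the target tokens from `x_mid` and `t`. Starting from synthons rather than from the product or from noise means fewer positions differ between source and target, which is one way to read the abstract's description of synthons as a more informative source distribution.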
Taxonomy
Topics: Machine Learning in Materials Science · Innovative Microfluidic and Catalytic Techniques Innovation · Computational Drug Discovery Methods
Methods: Sparse Evolutionary Training
