Order Matters in Retrosynthesis: Structure-aware Generation via Reaction-Center-Guided Discrete Flow Matching
Chenguang Wang, Zihan Zhou, Lei Bai, Tianshu Yu

TL;DR
This paper introduces a structure-aware, template-free retrosynthesis framework that leverages atom ordering and reaction center positioning to improve prediction accuracy and efficiency, outperforming existing models with less data and fewer steps.
Contribution
The paper proposes a structure-aware approach that combines reaction-center-guided positional encoding with a graph transformer backbone (RetroDiT) for more accurate and efficient retrosynthesis prediction.
Findings
Achieves state-of-the-art top-1 accuracy on USPTO datasets.
Generates predictions in 20-50 steps, significantly fewer than prior diffusion methods.
Structural priors outperform brute-force scaling, enabling smaller models to match larger ones.
Abstract
Template-free retrosynthesis methods treat the task as black-box sequence generation, limiting learning efficiency, while semi-template approaches rely on rigid reaction libraries that constrain generalization. We address this gap with a key insight: atom ordering in neural representations matters. Building on this insight, we propose a structure-aware template-free framework that encodes the two-stage nature of chemical reactions as a positional inductive bias. By placing reaction center atoms at the sequence head, our method transforms implicit chemical knowledge into explicit positional patterns that the model can readily capture. The proposed RetroDiT backbone, a graph transformer with rotary position embeddings, exploits this ordering to prioritize chemically critical regions. Combined with discrete flow matching, our approach decouples training from sampling and enables generation…
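The idea of placing reaction center atoms at the sequence head can be illustrated with a small sketch. It is not the authors' implementation. The function names (`reaction_center_first_order`, `positions_from_order`), the bond-list input format, and the breadth-first rule used to order the remaining atoms are all assumptions made for illustration. The sketch only shows how a known reaction center could be turned into position indices for a position-aware model to consume.

```python
from collections import deque


def reaction_center_first_order(num_atoms, bonds, reaction_center):
    """Return a permutation of atom indices with reaction-center atoms first.

    Remaining atoms follow in breadth-first order outward from the center.
    The breadth-first rule is an assumption, not taken from the paper.
    """
    # Build an undirected adjacency list from the bond list.
    neighbors = {i: [] for i in range(num_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Reaction-center atoms occupy the head of the sequence (deduplicated,
    # keeping the order in which they were given).
    order = list(dict.fromkeys(reaction_center))
    seen = set(order)
    queue = deque(order)

    # Expand outward so atoms closer to the center get smaller positions.
    while queue:
        atom = queue.popleft()
        for nxt in sorted(neighbors[atom]):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)

    # Atoms unreachable from the center (disconnected fragments) go last.
    order.extend(i for i in range(num_atoms) if i not in seen)
    return order


def positions_from_order(order):
    """Map each original atom index to its new sequence position."""
    return {atom: pos for pos, atom in enumerate(order)}


if __name__ == "__main__":
    # Hypothetical 5-atom chain 0-1-2-3-4 whose reaction center is atoms 2 and 3.
    chain_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
    order = reaction_center_first_order(5, chain_bonds, [2, 3])
    print(order)
    print(positions_from_order(order))
```

In this toy chain the two reaction-center atoms receive positions 0 and 1, and the other atoms are ranked by their graph distance from the center. Position indices of this kind are what a positional scheme such as rotary position embeddings would then operate on.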
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Graph Neural Networks · Topic Modeling
