DiffER: Categorical Diffusion for Chemical Retrosynthesis
Sean Current, Ziqi Chen, Daniel Adu-Ampratwum, Xia Ning, Srinivasan Parthasarathy

TL;DR
DiffER introduces a novel template-free categorical diffusion approach for chemical retrosynthesis, predicting entire SMILES sequences simultaneously and achieving state-of-the-art accuracy among non-template methods.
Contribution
The paper presents DiffER, a new diffusion-based model for retrosynthesis that outperforms existing template-free methods on top-1 accuracy, remains competitive on top-3, top-5, and top-10 accuracy, and includes a length prediction component for improved accuracy.
Findings
DiffER achieves state-of-the-art top-1 accuracy among template-free methods.
Ensembling the diffusion models improves the confidence and likelihood of predictions.
Accurate SMILES length prediction is crucial for model performance.
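The top-k accuracy metric behind these findings can be illustrated with a minimal sketch. The function name, the toy SMILES strings, and the exact-string matching rule are assumptions made for illustration; a real evaluation would normally canonicalize SMILES with a cheminformatics toolkit before comparing, and this is not the paper's evaluation code.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of targets that appear among the first k ranked candidates.

    ranked_predictions: one list of candidate reactant SMILES per product,
                        ordered from most to least likely.
    targets:            the ground-truth reactant SMILES for each product.
    Matching is by exact string equality (an assumption for this sketch).
    """
    hits = sum(
        1
        for candidates, target in zip(ranked_predictions, targets)
        if target in candidates[:k]
    )
    return hits / len(targets)


# Toy data, invented for illustration only.
ranked = [
    ["CCO.CC(=O)O", "CCO.CC(=O)Cl"],   # correct answer ranked first
    ["c1ccccc1Br", "c1ccccc1I"],       # correct answer ranked second
]
truth = ["CCO.CC(=O)O", "c1ccccc1I"]

top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
```

Here the first product is solved at rank 1 and the second only at rank 2, so top-1 accuracy is lower than top-2 accuracy on this toy set.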
Abstract
Methods for automatic chemical retrosynthesis have found recent success through the application of models traditionally built for natural language processing, primarily through transformer neural networks. These models have demonstrated significant ability to translate between the SMILES encodings of chemical products and reactants, but are constrained as a result of their autoregressive nature. We propose DiffER, an alternative template-free method for retrosynthesis prediction in the form of categorical diffusion, which allows the entire output SMILES sequence to be predicted in unison. We construct an ensemble of diffusion models which achieves state-of-the-art performance for top-1 accuracy and competitive performance for top-3, top-5, and top-10 accuracy among template-free methods. We prove that DiffER is a strong baseline for a new class of template-free model, capable of…
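To make the contrast with autoregressive decoding concrete, the following is a minimal sketch of one forward (noising) step of categorical diffusion over SMILES tokens. It assumes a uniform transition kernel, a character-level toy vocabulary, and a hypothetical helper named noise_tokens; none of these details are taken from the paper, and DiffER's actual noise schedule and tokenizer may differ.

```python
import random

# Hypothetical character-level vocabulary, for illustration only.
VOCAB = ["C", "N", "O", "c", "1", "(", ")", "=", "."]


def noise_tokens(tokens, keep_prob, vocab, rng):
    """Apply one uniform categorical corruption to every position at once.

    Each token is kept with probability keep_prob; otherwise it is replaced
    by a token drawn uniformly from vocab. The sequence length never changes.
    """
    return [
        tok if rng.random() < keep_prob else rng.choice(vocab)
        for tok in tokens
    ]


rng = random.Random(0)
reactants = list("CC(=O)O.CCO")

# Lightly and heavily corrupted versions of the same sequence.
slightly_noised = noise_tokens(reactants, keep_prob=0.9, vocab=VOCAB, rng=rng)
heavily_noised = noise_tokens(reactants, keep_prob=0.1, vocab=VOCAB, rng=rng)
```

A denoising model trained to reverse this corruption updates all positions together rather than emitting one token at a time. Because the corruption preserves sequence length, the output length has to be fixed before denoising begins, which is one plausible reading of why the length prediction component matters.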
Taxonomy
Topics: Advanced Materials Characterization Techniques
Methods: Diffusion
