Modeling Diverse Chemical Reactions for Single-step Retrosynthesis via Discrete Latent Variables
Huarui He, Jie Wang, Yunfei Liu, Feng Wu

TL;DR
This paper introduces RetroDCVAE, a sequence-based model that uses discrete latent variables and the Gumbel-Softmax relaxation to generate diverse reactant sets for single-step retrosynthesis, improving reaction diversity over existing methods.
Contribution
The work presents RetroDCVAE, an approach that integrates conditional variational autoencoders with discrete latent variables to increase the diversity of retrosynthesis predictions.
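The Gumbel-Softmax relaxation mentioned in the TL;DR can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gumbel_softmax_sample`, the four-category prior, and the temperature value are all assumptions chosen for illustration, and the code uses only the Python standard library rather than a deep-learning framework.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw one relaxed one-hot sample over discrete latent categories.

    logits: unnormalized log-probabilities, one per category.
    temperature: lower values push the sample closer to a hard one-hot vector.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical prior over 4 latent categories (illustrative values only).
rng = random.Random(0)
logits = [math.log(p) for p in (0.1, 0.2, 0.3, 0.4)]
sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
# The argmax of the relaxed sample plays the role of the chosen latent category.
hard_index = max(range(len(sample)), key=sample.__getitem__)
```

The relaxed sample sums to one and stays differentiable with respect to the logits when written in an autodiff framework, which is what makes this relaxation usable for training models with discrete latent variables. In a model of this kind, each latent category could condition the decoder toward a different reactant set, though how the paper wires this up is not shown in the text above.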
Findings
RetroDCVAE outperforms state-of-the-art models on benchmark datasets.
The model effectively captures multi-modal reaction distributions.
It generates diverse reactant candidates for a given product.
Abstract
Single-step retrosynthesis is the cornerstone of retrosynthesis planning, which is a crucial task for computer-aided drug discovery. The goal of single-step retrosynthesis is to identify the possible reactants that lead to the synthesis of the target product in one reaction. By representing organic molecules as canonical strings, existing sequence-based retrosynthetic methods treat the product-to-reactant retrosynthesis as a sequence-to-sequence translation problem. However, most of them struggle to identify diverse chemical reactions for a desired product due to the deterministic inference, which contradicts the fact that many compounds can be synthesized through various reaction types with different sets of reactants. In this work, we aim to increase reaction diversity and generate various reactants using discrete latent variables. We propose a novel sequence-based approach, namely…
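To make the sequence-to-sequence framing in the abstract concrete, the sketch below turns a product and a reactant set into token sequences, as a translation model would see them. The regular expression is a commonly used SMILES tokenization pattern and is an assumption here, not something taken from the paper. The example reaction (acetyl chloride and aniline giving acetanilide) and the hand-written SMILES strings are also illustrative; they were not run through a canonicalizer.

```python
import re

# Assumed SMILES tokenization pattern: bracket atoms, two-letter halogens,
# single-letter atoms, bonds, branches, and ring-closure digits.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens, rejecting unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips unmatched characters, so check the round trip.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


# Source sequence: the product (acetanilide).
product = "CC(=O)Nc1ccccc1"
# Target sequence: the reactant set, with '.' separating the two molecules.
reactants = "CC(=O)Cl.Nc1ccccc1"

source_tokens = tokenize_smiles(product)
target_tokens = tokenize_smiles(reactants)
```

A deterministic decoder maps `source_tokens` to a single most likely `target_tokens` sequence, which is the limitation the abstract points to: the same product may be reachable from several different reactant sets.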
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Chemical Synthesis and Analysis
