Learning to Make Generalizable and Diverse Predictions for Retrosynthesis
Benson Chen, Tianxiao Shen, Tommi S. Jaakkola, Regina Barzilay

TL;DR
This paper introduces a Transformer-based model with novel pre-training and latent variable techniques to improve the accuracy and diversity of retrosynthetic reaction predictions, demonstrating significant gains on the USPTO-50k benchmark.
Contribution
The paper presents a Transformer-based model for retrosynthesis prediction that combines two pre-training methods built on auxiliary tasks with a discrete latent variable mechanism for producing diverse predictions.
Findings
Greatly outperforms the baseline model on the USPTO-50k benchmark.
Produces a more diverse set of predicted reactants.
Improves the generalizability of retrosynthesis predictions.
Abstract
We propose a new model for making generalizable and diverse retrosynthetic reaction predictions. Given a target compound, the task is to predict the likely chemical reactants to produce the target. This generative task can be framed as a sequence-to-sequence problem by using the SMILES representations of the molecules. Building on top of the popular Transformer architecture, we propose two novel pre-training methods that construct relevant auxiliary tasks (plausible reactions) for our problem. Furthermore, we incorporate a discrete latent variable model into the architecture to encourage the model to produce a diverse set of alternative predictions. On the 50k subset of reaction examples from the United States patent literature (USPTO-50k) benchmark dataset, our model greatly improves performance over the baseline, while also generating predictions that are more diverse.
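The sequence-to-sequence framing described in the abstract can be illustrated with a minimal sketch: the target compound's SMILES string becomes the source sequence, and the reactants' SMILES strings, joined with `.`, become the target sequence. The tokenizer pattern, the `tokenize` and `make_example` helpers, and the esterification example are assumptions made for illustration; they are not taken from the paper, whose actual tokenization scheme may differ.

```python
import re
from typing import Dict, List

# Illustrative SMILES tokenizer (an assumption, not the paper's method):
# bracket atoms such as [NH4+] stay whole, the two-letter halogens Br and Cl
# stay whole, ring-closure labels like %12 stay whole, and everything else
# is split into single characters.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def make_example(product: str, reactants: List[str]) -> Dict[str, List[str]]:
    """Build one retrosynthesis training pair: product tokens -> reactant tokens.

    Multiple reactants are joined with '.', the SMILES separator for
    disconnected molecules.
    """
    return {
        "source": tokenize(product),
        "target": tokenize(".".join(reactants)),
    }


# Hypothetical example: ethyl acetate as the target compound, with
# acetic acid and ethanol as the reactants.
example = make_example("CC(=O)OCC", ["CC(=O)O", "CCO"])
print(" ".join(example["source"]))
print(" ".join(example["target"]))
```

A Transformer encoder-decoder would then be trained to map the source tokens to the target tokens, and alternative reactant sets could be obtained by decoding more than one output sequence for the same source.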
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Topic Modeling
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
