Chemist-aligned retrosynthesis by ensembling diverse inductive bias models
Krzysztof Maziarz, Guoqing Liu, Hubert Misztela, Austin Tripp, Junren Li, Aleksei Kornev, Piotr Gaiński, Holger Hoefling, Mike Fortunato, Rishi Gupta, Marwin Segler

TL;DR
RetroChimera is a novel ensembling-based retrosynthesis model that combines diverse inductive biases, outperforming existing models in accuracy, robustness, and alignment with chemists' expectations, even with limited data and across distribution shifts.
Contribution
The paper introduces RetroChimera, a framework that ensembles retrosynthesis models with diverse inductive biases through a learned ensembling strategy, improving accuracy, robustness, and alignment with chemists' expectations.
Findings
Outperforms major models across various data scales and splits.
Demonstrates robustness outside training data and with small sample sizes.
Achieves high agreement with industrial chemists and generalizes to external datasets.
Abstract
Chemical synthesis remains a critical bottleneck in the discovery and manufacture of functional small molecules. AI-based synthesis planning models could be a potential remedy to find effective syntheses, and have made progress in recent years. However, they still struggle with less frequent, yet critical reactions for synthetic strategy, as well as hallucinated, incorrect predictions. This hampers multi-step search algorithms that rely on models, and leads to misalignment with chemists' expectations. Here we propose RetroChimera: a frontier retrosynthesis model, built upon two newly developed components with complementary inductive biases, which we fuse together using a new framework for integrating predictions from multiple sources via a learning-based ensembling strategy. Through experiments across several orders of magnitude in data scale and splitting strategy, we show RetroChimera…
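To make the idea of fusing predictions from multiple sources concrete, here is a minimal Python sketch of rank-based ensembling. It is an illustration under stated assumptions, not the paper's implementation: the function `ensemble_rankings`, the model names `model_a` and `model_b`, the candidate labels, and the per-rank weights are all hypothetical. In a learning-based scheme the weights would be fitted on held-out data rather than written by hand.

```python
from collections import defaultdict


def ensemble_rankings(ranked_lists, rank_weights):
    """Fuse ranked candidate lists from several models into one ranking.

    ranked_lists: dict mapping model name -> list of candidates, best first.
    rank_weights: dict mapping model name -> list of scores, one per rank
        position (hypothetical values; assumed to be learned elsewhere).
    """
    scores = defaultdict(float)
    for model, candidates in ranked_lists.items():
        weights = rank_weights[model]
        seen = set()
        # Only the top len(weights) positions of each model contribute.
        for rank, candidate in enumerate(candidates[:len(weights)]):
            if candidate in seen:
                continue  # count each candidate once per model
            seen.add(candidate)
            scores[candidate] += weights[rank]
    # Highest total score first; ties broken by candidate label.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Toy example: candidates stand in for predicted reactant sets.
ranked_lists = {
    "model_a": ["A", "B", "C"],
    "model_b": ["B", "D", "A"],
}
rank_weights = {
    "model_a": [1.0, 0.5, 0.25],
    "model_b": [0.8, 0.4, 0.2],
}
fused = ensemble_rankings(ranked_lists, rank_weights)
print(fused)
```

In this toy example, candidate B is ranked second by one model and first by the other, so its summed score exceeds that of A, which only one model ranks first. A candidate supported by several models can therefore overtake one that a single model prefers.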
Videos
Chimera: Accurate synthesis prediction by ensembling models with... | Microsoft Research Forum · YouTube
Taxonomy
Topics: Graph Theory and Algorithms · Machine Learning in Materials Science
Methods: Chimera
