A Scientific Reasoning Model for Organic Synthesis Procedure Generation
Guoqing Liu, Junren Li, Zihan Zhao, Eray Inanc, Krzysztof Maziarz, Jose Garrido Torres, Victor Garcia Satorras, Shoko Ueda, Christopher M. Bishop, Marwin Segler

TL;DR
This paper introduces QFANG, a scientific reasoning model that generates detailed experimental procedures for chemical synthesis directly from reaction equations, combining chemical knowledge with explicit reasoning to bridge computational route design and laboratory execution.
Contribution
The paper presents QFANG, a language model built with a Chemistry-Guided Reasoning (CGR) framework and developed on a curated dataset of 905,990 patent-derived reactions, capable of producing accurate, structured synthesis procedures with explicit chain-of-thought reasoning.
Findings
QFANG outperforms baseline models in accuracy and relevance.
The model generalizes to some out-of-domain reactions.
Reinforcement learning further improves procedural accuracy.
Abstract
Solving computer-aided synthesis planning is essential for enabling fully automated, robot-assisted synthesis workflows and improving the efficiency of drug discovery. A key challenge, however, is bridging the gap between computational route design and practical laboratory execution, particularly the accurate prediction of viable experimental procedures for each synthesis step. In this work, we present QFANG, a scientific reasoning language model capable of generating precise, structured experimental procedures directly from reaction equations, with explicit chain-of-thought reasoning. To develop QFANG, we curated a high-quality dataset comprising 905,990 chemical reactions paired with structured action sequences, extracted and processed from patent literature using large language models. We introduce a Chemistry-Guided Reasoning (CGR) framework that produces chain-of-thought data…
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · AI-based Problem Solving and Planning
