Protect$^*$: Steerable Retrosynthesis through Neuro-Symbolic State Encoding
Shreyas Vinaya Sathyanarayana, Shah Rahil Kirankumar, Sharanabasava D. Hiremath, Bharath Ramsundar

TL;DR
Protect$^*$ is a neuro-symbolic framework that enhances LLM-based retrosynthesis by integrating chemical logic and expert constraints, enabling more reliable and controllable synthetic pathway generation.
Contribution
It introduces a hybrid neuro-symbolic system combining rule-based reasoning with neural models for chemically guided retrosynthesis.
Findings
Successfully applied to complex natural products
Enabled discovery of a novel Erythromycin B pathway
Improved control and reliability in retrosynthesis generation
Abstract
Large Language Models (LLMs) have shown remarkable potential in scientific domains such as retrosynthesis, yet they often lack the fine-grained control needed to navigate complex problem spaces without error. A critical challenge is directing an LLM to avoid specific, chemically sensitive sites on a molecule, a task where unconstrained generation can lead to invalid or undesirable synthetic pathways. In this work, we introduce Protect$^*$, a neuro-symbolic framework that grounds the generative capabilities of LLMs in rigorous chemical logic. Our approach combines automated rule-based reasoning, which draws on a comprehensive database of 55+ SMARTS patterns and 40+ characterized protecting groups, with the generative intuition of neural models. The system operates via a hybrid architecture: an "automatic mode" where symbolic logic deterministically identifies and…
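To make the idea of grounding an LLM in symbolic rules more concrete, the following is a minimal sketch of how detected sensitive sites might be turned into deterministic constraint lines for a prompt. It is not the authors' implementation. The `Site` type, the `PROTECTING_GROUPS` table, and the `encode_constraints` function are all hypothetical, the group pairings are textbook examples rather than entries from the paper's database, and the sites are assumed to come from an upstream SMARTS matcher that is not shown.

```python
from dataclasses import dataclass

# Hypothetical rule table: functional group -> candidate protecting groups.
# Textbook pairings for illustration only, not the paper's database.
PROTECTING_GROUPS = {
    "alcohol": ["TBS", "Bn"],
    "amine": ["Boc", "Cbz"],
    "carboxylic_acid": ["methyl ester"],
}


@dataclass(frozen=True)
class Site:
    """A sensitive site, assumed to be reported by an upstream SMARTS matcher."""
    atom_index: int
    group: str


def encode_constraints(sites):
    """Map each site to one constraint line, ordered by atom index.

    The output depends only on the rule table and the input sites, so the
    same molecule always yields the same constraints.
    """
    lines = []
    for site in sorted(sites, key=lambda s: s.atom_index):
        options = PROTECTING_GROUPS.get(site.group)
        if options is None:
            # No rule available: fall back to a conservative instruction.
            lines.append(f"atom {site.atom_index} ({site.group}): no rule, do not modify")
        else:
            choices = " or ".join(options)
            lines.append(f"atom {site.atom_index} ({site.group}): protect with {choices}")
    return lines


# Example input: two sites on an imaginary molecule.
sites = [Site(7, "amine"), Site(2, "alcohol")]
constraints = encode_constraints(sites)
for line in constraints:
    print(line)
```

In a sketch like this, the constraint lines would be prepended to the LLM prompt so that generation is conditioned on the symbolic analysis. How Protect$^*$ actually encodes this state is not specified in the truncated abstract above.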
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Graph Neural Networks · Topic Modeling
