Protect$^*$: Steerable Retrosynthesis through Neuro-Symbolic State Encoding

Shreyas Vinaya Sathyanarayana; Shah Rahil Kirankumar; Sharanabasava D. Hiremath; Bharath Ramsundar

arXiv:2602.13419·q-bio.QM·February 17, 2026

Open Access

TL;DR

Protect$^*$ is a neuro-symbolic framework that enhances LLM-based retrosynthesis by integrating chemical logic and expert constraints, enabling more reliable and controllable synthetic pathway generation.

Contribution

It introduces a hybrid neuro-symbolic system combining rule-based reasoning with neural models for chemically guided retrosynthesis.

Findings

01. Successfully applied to complex natural products

02. Enabled discovery of a novel Erythromycin B pathway

03. Improved control and reliability in retrosynthesis generation

Abstract

Large Language Models (LLMs) have shown remarkable potential in scientific domains like retrosynthesis, yet they often lack the fine-grained control necessary to navigate complex problem spaces without error. A critical challenge is directing an LLM to avoid specific, chemically sensitive sites on a molecule, a task where unconstrained generation can lead to invalid or undesirable synthetic pathways. In this work, we introduce Protect$^*$, a neuro-symbolic framework that grounds the generative capabilities of LLMs in rigorous chemical logic. Our approach combines automated rule-based reasoning (using a comprehensive database of 55+ SMARTS patterns and 40+ characterized protecting groups) with the generative intuition of neural models. The system operates via a hybrid architecture: an "automatic mode" where symbolic logic deterministically identifies and…
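The rule-based half of this idea can be illustrated with a small lookup table. The following is a minimal sketch, not the authors' implementation: `ProtectionRule`, `RULES`, and `build_constraints` are hypothetical names, the three-entry table stands in for the paper's much larger database, and the SMARTS strings are stored only as data (actually matching them against a molecule would require a cheminformatics toolkit, which is assumed to have run upstream and produced the atom indices).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionRule:
    """One hypothetical rule: a functional group and groups that can protect it."""
    group: str                 # human-readable functional group name
    smarts: str                # pattern a toolkit would match; not evaluated here
    protecting_groups: tuple   # candidate protecting groups for this site


# Illustrative table only; the paper's database is far larger.
RULES = (
    ProtectionRule("alcohol", "[OX2H]", ("TBS", "Bn")),
    ProtectionRule("amine", "[NX3;H2,H1;!$(NC=O)]", ("Boc", "Cbz")),
    ProtectionRule("aldehyde", "[CX3H1](=O)[#6]", ("dimethyl acetal",)),
)


def build_constraints(detected_sites):
    """Turn detected sites into plain-text constraint lines.

    detected_sites maps a functional group name to a list of atom indices,
    assumed to come from an upstream SMARTS-matching step.
    Groups with no rule in the table are skipped.
    """
    by_group = {rule.group: rule for rule in RULES}
    lines = []
    for group in sorted(detected_sites):
        rule = by_group.get(group)
        if rule is None:
            continue
        options = ", ".join(rule.protecting_groups)
        for atom_index in detected_sites[group]:
            lines.append(
                f"atom {atom_index} ({group}): sensitive site; "
                f"candidate protecting groups: {options}"
            )
    return lines


constraints = build_constraints({"alcohol": [3, 7], "amine": [12], "thiol": [20]})
for line in constraints:
    print(line)
```

Because the lookup is deterministic, the same detected sites always yield the same constraint lines, which could then be placed in front of a generative model as explicit context. How Protect$^*$ itself encodes and passes this state is not specified in the text available here.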

Taxonomy

Topics: Machine Learning in Materials Science · Advanced Graph Neural Networks · Topic Modeling