Scaffold-Conditioned Preference Triplets for Controllable Molecular Optimization with Large Language Models
Yi Xiong, Liang Xiong, Xiaohong Ji, Sen Yang, Zhifeng Gao, Huaimin Wang, Kele Xu

TL;DR
This paper introduces SCPT, a pipeline that constructs scaffold-conditioned preference triplets and uses them to train large language models for controllable, scaffold-preserving molecular optimization, improving success rates and property gains.
Contribution
The paper presents a novel scaffold-conditioned preference triplet construction method for training LLMs to perform controlled molecular edits that preserve scaffolds.
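As a rough illustration of what a triplet construction step of this kind might look like, the sketch below filters candidate edits of a source molecule and pairs property-improving edits with non-improving ones. It is a sketch under stated assumptions, not the authors' implementation: the names `Candidate` and `build_triplets`, the (source, preferred, rejected) layout, and all threshold values are hypothetical. Scaffold identifiers, similarity, property, and synthetic-accessibility scores are assumed to be precomputed elsewhere (for example with a cheminformatics toolkit), and the numbers in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A molecule with precomputed descriptors (all fields are illustrative)."""
    smiles: str
    scaffold: str            # scaffold identifier, assumed precomputed
    similarity: float        # similarity to the source molecule, in [0, 1]
    prop: float              # property being optimized (higher is better)
    valid: bool = True       # passed a chemical validity check
    sa_score: float = 3.0    # synthetic accessibility (lower is easier)


def build_triplets(source, candidates, min_sim=0.4, min_gain=0.1, max_sa=4.0):
    """Return (source, preferred, rejected) SMILES triplets.

    Hypothetical rules: a candidate is kept only if it is valid, shares the
    source scaffold, is similar enough, and is synthesizable enough. Kept
    candidates with a gain of at least `min_gain` are "preferred"; kept
    candidates that do not improve on the source are "rejected".
    """
    kept = [
        c for c in candidates
        if c.valid
        and c.scaffold == source.scaffold
        and c.similarity >= min_sim
        and c.sa_score <= max_sa
    ]
    preferred = [c for c in kept if c.prop - source.prop >= min_gain]
    rejected = [c for c in kept if c.prop <= source.prop]
    return [
        (source.smiles, win.smiles, lose.smiles)
        for win in preferred
        for lose in rejected
    ]
```

With a toluene-like source (`prop=0.2`) and four candidate edits, only the edits that keep the benzene scaffold and stay above the similarity threshold take part in a triplet:

```python
source = Candidate("Cc1ccccc1", "c1ccccc1", 1.0, 0.2)
candidates = [
    Candidate("Oc1ccccc1", "c1ccccc1", 0.7, 0.6),        # improves: preferred
    Candidate("CCc1ccccc1", "c1ccccc1", 0.6, 0.1),       # no gain: rejected
    Candidate("Cc1ccncc1", "c1ccncc1", 0.5, 0.9),        # different scaffold
    Candidate("CCCCCCc1ccccc1", "c1ccccc1", 0.2, 0.8),   # too dissimilar
]
triplets = build_triplets(source, candidates)
```

Raising `min_sim` or `min_gain` tightens the trade-off between staying close to the source and demanding larger property improvements.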
Findings
SCPT improves optimization success and property gains.
Models trained on limited supervision generalize well to multi-property tasks.
SCPT allows systematic control over similarity and property trade-offs.
Abstract
Molecular property optimization is central to drug discovery, yet many deep learning methods rely on black-box scoring and offer limited control over scaffold preservation, often producing unstable or biologically implausible edits. While large language models (LLMs) are promising molecular generators, optimization remains constrained by the lack of chemistry-grounded preference supervision and principled data curation. We introduce Scaffold-Conditioned Preference Triplets (SCPT), a pipeline that constructs similarity-constrained triplets via scaffold alignment and chemistry-driven filters for validity, synthesizability, and meaningful property gains. Using these preferences, we align a pretrained molecular LLM as a conditional editor, enabling property-improving edits that retain the scaffold. Across single- and…
