Bridging the Plausibility-Validity Gap by Fine-Tuning a Reasoning-Enhanced LLM for Chemical Synthesis and Discovery

Malikussaid; Hilal Hudan Nuha; Isman Kurniawan

arXiv:2507.07328 · cs.LG · January 7, 2026

TL;DR

This paper introduces a fine-tuned, reasoning-enhanced large language model for chemistry that significantly improves the validity and feasibility of generated chemical synthesis pathways, addressing the plausibility-validity gap.

Contribution

It combines a reasoning-centric model (Magistral Small) with Low-Rank Adaptation fine-tuning on a dual-domain dataset covering molecular properties and chemical transformations. The fine-tuned system outperforms MolT5 in chemical validity (97.4% vs 77.2%) and reaches 74.4% synthesis feasibility.

Findings

01. Achieves 96.3% format adherence and 97.4% chemical validity.

02. Outperforms MolT5 in validity (97.4% vs 77.2%).

03. Comparable to expert-rated systems like ChemCrow.

Abstract

Large Language Models frequently generate outputs that appear scientifically reasonable yet violate fundamental principles, a phenomenon we characterize as the "plausibility-validity gap." This challenge proves especially acute in chemistry, where superficial correctness masks deeper errors in molecular structure, reaction mechanisms, and synthetic pathways. We present a systematic approach combining a reasoning-centric model architecture (Magistral Small) with Low-Rank Adaptation fine-tuning on a dual-domain dataset covering molecular properties and chemical transformations. Evaluation reveals substantial improvements: the fine-tuned system achieves 96.3% format adherence, 97.4% chemical validity, and 74.4% synthesis feasibility. Comparative analysis shows our approach outperforms specialized translation models like MolT5 (97.4% vs 77.2% validity) while achieving performance comparable…
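The abstract names Low-Rank Adaptation but this listing does not say how it was configured. As a rough illustration of the general technique only, the sketch below applies the standard LoRA reparameterization W' = W + (alpha / r) · B · A to a small matrix in plain Python. The function names, the rank, the scaling factor, and the layer sizes are all hypothetical choices made for the example; none of them are taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), the usual LoRA update.

    w: frozen base weight, shape d_out x d_in
    b: trainable factor, shape d_out x r
    a: trainable factor, shape r x d_in
    alpha: scaling hyperparameter (illustrative value, not from the paper)
    """
    r = len(a)                # the rank is the number of rows in A
    scale = alpha / r
    delta = matmul(b, a)      # low-rank update with the same shape as W
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_fraction(d_out, d_in, r):
    """Fraction of a layer's parameters that LoRA trains at rank r."""
    return r * (d_out + d_in) / (d_out * d_in)


# Toy rank-1 example with made-up numbers.
w = [[1.0, 2.0], [3.0, 4.0]]
a = [[1.0, 0.0]]              # 1 x 2
b = [[1.0], [2.0]]            # 2 x 1
adapted = lora_effective_weight(w, a, b, alpha=2.0)
```

Only the two small factors are trained while the base weight stays frozen. For a hypothetical 1024 x 1024 layer at rank 8, `trainable_fraction(1024, 1024, 8)` comes to about 1.6% of the layer's parameters.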
