TSSR: Two-Stage Swap-Reward-Driven Reinforcement Learning for Character-Level SMILES Generation
Jacob Ede Levine, Yun Lyan Luo, Sai Chandra Kosaraju

TL;DR
TSSR is a novel two-stage reinforcement learning framework that improves the validity, diversity, and chemical correctness of character-level SMILES molecule generation by combining syntax repair and chemistry-aware feedback.
Contribution
The paper introduces a two-stage, reward-driven RL method that improves SMILES generation without task-specific labels or handcrafted grammars.
Findings
Significantly improves syntactic and chemical validity of generated molecules
Preserves drug-likeness and synthesizability while increasing diversity
Enhances molecule validity and novelty in both pure and fine-tuned RL settings
Abstract
The design of reliable, valid, and diverse molecules is fundamental to modern drug discovery, as improved molecular generation supports efficient exploration of the chemical space for potential drug candidates and reduces the cost of early design efforts. Despite these needs, current chemical language models that generate molecules as SMILES strings are vulnerable to compounding token errors: many samples are unparseable or chemically implausible, and hard constraints meant to prevent failure can restrict exploration. To address this gap, we introduce TSSR, a Two-Stage, Swap-Reward-driven reinforcement learning (RL) framework for character-level SMILES generation. Stage one rewards local token swaps that repair syntax, promoting transitions from invalid to parseable strings. Stage two provides chemistry-aware feedback from RDKit diagnostics, rewarding reductions in valence, aromaticity,…
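The stage-one idea in the abstract (reward a local edit when it turns an invalid string into a parseable one) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the reward magnitudes, and the reading of a "swap" as a single-character substitution are all assumptions made for illustration. The validity check is a toy stand-in that only looks at branch parentheses and single-digit ring-closure labels; a real setup would presumably call an actual SMILES parser such as RDKit instead.

```python
from typing import Callable


def toy_is_parseable(smiles: str) -> bool:
    """Toy stand-in for a real SMILES parser (hypothetical helper).

    Checks only that branch parentheses are balanced and that every
    single-digit ring-closure label appears an even number of times.
    It knows nothing about atoms, bonds, brackets, or valence.
    """
    depth = 0
    open_rings = set()
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing a branch that was never opened
        elif ch.isdigit():
            open_rings ^= {ch}  # first occurrence opens, second closes
    return bool(smiles) and depth == 0 and not open_rings


def apply_swap(smiles: str, position: int, new_char: str) -> str:
    """Replace the character at `position` (assumed meaning of a local swap)."""
    return smiles[:position] + new_char + smiles[position + 1:]


def swap_reward(
    before: str,
    after: str,
    is_valid: Callable[[str], bool] = toy_is_parseable,
    bonus: float = 1.0,      # assumed value, not taken from the paper
    penalty: float = -1.0,   # assumed value, not taken from the paper
) -> float:
    """Reward an invalid-to-valid transition, penalize the reverse."""
    was_valid = is_valid(before)
    now_valid = is_valid(after)
    if not was_valid and now_valid:
        return bonus
    if was_valid and not now_valid:
        return penalty
    return 0.0


broken = "c1ccccc("                   # unclosed branch and unclosed ring
repaired = apply_swap(broken, 7, "1")  # gives "c1ccccc1"
print(repaired, swap_reward(broken, repaired))
```

Passing the validity check in as a parameter keeps the reward logic independent of the parser, so the toy checker can be replaced by a stricter one without changing `swap_reward`.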
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Chemical Synthesis and Analysis
