MolSight: Optical Chemical Structure Recognition with SMILES Pretraining, Multi-Granularity Learning and Reinforcement Learning
Wenrui Zhang, Xinggang Wang, Bin Feng, Wenyu Liu

TL;DR
MolSight is a multi-stage learning framework that significantly improves optical chemical structure recognition, especially stereochemistry, by combining pretraining, multi-granularity fine-tuning, and reinforcement learning.
Contribution
This work introduces MolSight, a novel three-stage training paradigm with reinforcement learning and a new stereochemical dataset for enhanced OCSR accuracy.
Findings
MolSight achieves state-of-the-art results across multiple datasets.
Reinforcement learning further improves stereochemical recognition.
Multi-granularity fine-tuning enhances molecular formula accuracy.
Abstract
Optical Chemical Structure Recognition (OCSR) plays a pivotal role in modern chemical informatics, enabling the automated conversion of chemical structure images from scientific literature, patents, and educational materials into machine-readable molecular representations. This capability is essential for large-scale chemical data mining, drug discovery pipelines, and Large Language Model (LLM) applications in related domains. However, existing OCSR systems face significant challenges in accurately recognizing stereochemical information due to the subtle visual cues that distinguish stereoisomers, such as wedge and dash bonds, ring conformations, and spatial arrangements. To address these challenges, we propose MolSight, a comprehensive learning framework for OCSR that employs a three-stage training paradigm. In the first stage, we conduct pre-training on large-scale but noisy datasets…
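To make the stereochemistry problem concrete, here is a minimal sketch of how two stereoisomers can differ by a single marker in their SMILES strings. It is illustrative only and is not MolSight's evaluation code. The helper names `strip_stereo` and `compare` are hypothetical, and the comparison is purely textual, with no SMILES canonicalization.

```python
import re

# Characters that carry stereo information in SMILES:
# '@' (tetrahedral chirality), '/' and '\' (double-bond geometry).
_STEREO_CHARS = re.compile(r"[@/\\]")


def strip_stereo(smiles: str) -> str:
    """Remove stereo markers from a SMILES string (textual only, no canonicalization)."""
    return _STEREO_CHARS.sub("", smiles)


def compare(pred: str, truth: str) -> dict:
    """Report string-level matches with and without stereo markers."""
    return {
        "exact": pred == truth,
        "ignoring_stereo": strip_stereo(pred) == strip_stereo(truth),
    }


# The two enantiomers of alanine differ only in the chirality tag (@ vs @@).
alanine_a = "C[C@H](N)C(=O)O"
alanine_b = "C[C@@H](N)C(=O)O"

result = compare(alanine_b, alanine_a)
print(result)
```

A prediction that gets every atom and bond right but misreads one wedge or dash bond fails the exact comparison while passing the stereo-agnostic one. That gap is the kind of error the abstract attributes to subtle visual cues.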
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Molecular spectroscopy and chirality
