MolRecBench-Wild: A Real-World Benchmark for Optical Chemical Structure Recognition

Haote Yang; Hui Wang; Chen Zhu; Jingchao Wang; Linye Li; Hongbin Lai; Huijie Ao; Yongxuan Lyu; Jiang Wu; Jiaxing Sun; Lua Chen; Yuanyuan Cao; Ruijie Zhang; Shengxin Lu; Lijun Wu; Bin Wang; Conghui He

arXiv:2605.05832 · cs.AI · May 8, 2026

TL;DR

This paper introduces MolRecBench-Wild, a benchmark of 5,029 structures from 820 recent chemistry papers for Optical Chemical Structure Recognition (OCSR), together with the MOSAIC difficulty framework and the CARBON representation language. Existing models perform poorly on it.

Contribution

The paper presents three components for evaluating OCSR systems under practical conditions: the MolRecBench-Wild benchmark built from real publications, the MOSAIC difficulty framework, and the CARBON semantic representation language.

Findings

01. Models perform poorly on MolRecBench-Wild, indicating a large gap between current OCSR systems and real-world use.

02. The MOSAIC difficulty framework effectively characterizes the challenges in molecular diagram recognition.

03. The CARBON language enables a more faithful semantic representation of chemical structures.

Abstract

Optical Chemical Structure Recognition (OCSR) aims to translate molecular diagrams in scientific literature into machine-readable formats, but current systems remain unreliable on real-world images due to substantial visual and chemical complexity. We introduce MOSAIC, a dual-dimensional difficulty framework with 37 fine-grained labels that jointly characterize visual interference and chemical semantic challenges in molecular diagrams. Based on this framework, we construct MolRecBench-Wild, a benchmark of 5,029 structures from 820 recent chemistry papers, covering the full difficulty spectrum observed in real publications. To enable faithful semantic evaluation beyond SMILES and MolFile, we propose CARBON, a representation language capable of expressing valence variations, icon-based groups, and other non-standard chemical semantics. We further adopt a dual-track evaluation protocol…
