ChemVTS-Bench: Evaluating Visual-Textual-Symbolic Reasoning of Multimodal Large Language Models in Chemistry
Zhiyuan Huang, Baichuan Yang, Zikun He, Yanhong Wu, Fang Hongyu, Zhenhe Liu, Lin Dongsheng, Bing Su

TL;DR
ChemVTS-Bench is a benchmark that evaluates how well multimodal large language models (MLLMs) reason about chemistry across visual, textual, and symbolic modalities, using a diverse set of chemical problems.
Contribution
This work introduces ChemVTS-Bench, a domain-specific benchmark with an automated evaluation workflow, enabling detailed analysis of multimodal reasoning in chemical contexts.
Findings
Visual-only inputs are challenging for models.
Structural chemistry is the most difficult domain.
Multimodal fusion reduces but does not eliminate errors.
Abstract
Chemical reasoning inherently integrates visual, textual, and symbolic modalities, yet existing benchmarks rarely capture this complexity, often relying on simple image-text pairs with limited chemical semantics. As a result, the actual ability of Multimodal Large Language Models (MLLMs) to process and integrate chemically meaningful information across modalities remains unclear. We introduce ChemVTS-Bench, a domain-authentic benchmark designed to systematically evaluate the Visual-Textual-Symbolic (VTS) reasoning abilities of MLLMs. ChemVTS-Bench contains diverse and challenging chemical problems spanning organic molecules, inorganic materials, and 3D crystal structures, with each task presented in three complementary input modes: (1) visual-only, (2) visual-text hybrid, and (3) SMILES-based symbolic input. This design enables fine-grained analysis of modality-dependent…
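The three input modes listed in the abstract suggest a per-mode breakdown of model accuracy. The following is a minimal Python sketch of such a breakdown, not the benchmark's actual evaluation workflow: the `Result` record, the mode labels in `MODES`, the function `accuracy_by_mode`, and the toy records are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical labels for the three input modes named in the abstract.
MODES = ("visual_only", "visual_text", "smiles")


@dataclass
class Result:
    """One graded model answer (an assumed schema, not the benchmark's)."""
    task_id: str   # hypothetical task identifier
    mode: str      # one of MODES
    correct: bool  # whether the model's answer was judged correct


def accuracy_by_mode(results):
    """Return the fraction of correct answers for each input mode."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in results:
        if r.mode not in MODES:
            raise ValueError(f"unknown input mode: {r.mode!r}")
        totals[r.mode] += 1
        hits[r.mode] += int(r.correct)
    # Skip modes with no results to avoid dividing by zero.
    return {m: hits[m] / totals[m] for m in MODES if totals[m]}


# Toy records, made up purely for illustration (not results from the paper).
toy_results = [
    Result("task-1", "visual_only", False),
    Result("task-1", "visual_text", True),
    Result("task-1", "smiles", True),
    Result("task-2", "visual_only", True),
    Result("task-2", "visual_text", True),
    Result("task-2", "smiles", False),
]

per_mode = accuracy_by_mode(toy_results)
for mode, acc in per_mode.items():
    print(f"{mode}: {acc:.2f}")
```

Because every task appears once per mode in this sketch, differences between the per-mode accuracies can be attributed to the input representation rather than to the choice of problems.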
Taxonomy
Topics: Machine Learning in Materials Science · Multimodal Machine Learning Applications · Computational Drug Discovery Methods
