TinyChemVL: Advancing Chemical Vision-Language Models via Efficient Visual Token Reduction and Complex Reaction Tasks
Xuanle Zhao, Shuxin Zeng, Xinyuan Cai, Xiang Cheng, Duzhen Zhang, Xiuyi Chen, Bo Xu

TL;DR
TinyChemVL is a new chemical vision-language model that uses visual token reduction and reaction-level tasks to improve efficiency and reasoning in chemical image understanding, outperforming previous models with fewer parameters.
Contribution
The paper introduces TinyChemVL, a novel chemical VLM that employs visual token reduction and reaction-level tasks, significantly enhancing efficiency and reasoning over prior models.
Findings
TinyChemVL achieves superior performance on molecular and reaction tasks.
It outperforms ChemVLM while using only 1/16 of the visual tokens.
The model demonstrates faster inference and training speeds.
Abstract
While Vision Language Models (VLMs) have demonstrated remarkable capabilities in general visual understanding, their application in the chemical domain has been limited, with previous works predominantly focusing on text and thus overlooking critical visual information, such as molecular structures. Current approaches that directly adopt standard VLMs for chemical tasks suffer from two primary issues: (i) the computational inefficiency of processing entire chemical images with non-informative backgrounds, and (ii) a narrow focus on molecular-level tasks that restricts progress in chemical reasoning. In this work, we propose TinyChemVL, an efficient and powerful chemical VLM that leverages visual token reduction and reaction-level tasks to improve model efficiency and reasoning capacity. Also, we propose ChemRxn-V, a reaction-level benchmark for assessing vision-based reaction…
Taxonomy
Topics: Machine Learning in Materials Science · Multimodal Machine Learning Applications · Computational Drug Discovery Methods
