ChemDFM-X: Towards Large Multimodal Model for Chemistry
Zihan Zhao, Bo Chen, Jingpiao Li, Lu Chen, Liyang Wen, Pengyu Wang, Zichen Zhu, Danyang Zhang, Ziping Wan, Yansi Li, Zhongyang Dai, Xin Chen, and Kai Yu

TL;DR
ChemDFM-X is a pioneering large multimodal model designed for chemistry, capable of understanding and integrating diverse chemical data modalities, thus advancing the development of comprehensive chemical general intelligence.
Contribution
This work introduces the first cross-modal dialogue foundation model for chemistry, ChemDFM-X, with a large instruction-tuning dataset, enabling effective multimodal chemical task performance.
Findings
Demonstrates ChemDFM-X's ability to understand multiple chemical data modalities.
Achieves significant progress in aligning chemical data modalities.
Shows potential as a research assistant in chemical science.
Abstract
Rapid developments of AI tools are expected to offer unprecedented assistance to research in the natural sciences, including chemistry. However, neither existing unimodal task-specific specialist models nor emerging general large multimodal models (LMMs) can cover the wide range of chemical data modalities and task categories. To address the real demands of chemists, a cross-modal Chemical General Intelligence (CGI) system, which serves as a truly practical and useful research assistant utilizing the great potential of LMMs, is greatly needed. In this work, we introduce the first Cross-modal Dialogue Foundation Model for Chemistry (ChemDFM-X). Diverse multimodal data are generated from an initial modality by approximate calculations and task-specific model predictions. This strategy creates sufficient chemical training corpora while significantly reducing expense, resulting in an…
Taxonomy
Topics: Advanced Data Processing Techniques · Various Chemistry Research Topics · Computational Drug Discovery Methods
