MolBind: Multimodal Alignment of Language, Molecules, and Proteins
Teng Xiao, Chao Cui, Huaisheng Zhu, and Vasant G. Honavar

TL;DR
MolBind is a multi-modal contrastive learning framework that aligns natural language, 2D molecular graphs, 3D molecular conformations, and 3D proteins in a shared feature space. Pre-trained on a new four-modality dataset, MolBind-M4, it targets better zero-shot performance on drug discovery tasks.
Contribution
The paper introduces MolBind, a multi-modal contrastive learning framework that trains an encoder per modality and maps all of them to one shared feature space, together with MolBind-M4, a high-quality four-modality dataset for pre-training.
Findings
Superior zero-shot performance across multiple tasks
Effective semantic alignment of diverse modalities
Introduction of a new multi-modal dataset, MolBind-M4
Abstract
Recent advancements in biology and chemistry have leveraged multi-modal learning, integrating molecules and their natural language descriptions to enhance drug discovery. However, current pre-training frameworks are limited to two modalities, and designing a unified network to process different modalities (e.g., natural language, 2D molecular graphs, 3D molecular conformations, and 3D proteins) remains challenging due to inherent gaps among them. In this work, we propose MolBind, a framework that trains encoders for multiple modalities through contrastive learning, mapping all modalities to a shared feature space for multi-modal semantic alignment. To facilitate effective pre-training of MolBind on multiple modalities, we also build and collect a high-quality dataset with four modalities, MolBind-M4, including graph-language, conformation-language, graph-conformation, and…
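The contrastive alignment described in the abstract can be illustrated with a small sketch. This summary does not give MolBind's exact loss, encoders, or temperature, so the code below assumes a symmetric InfoNCE objective (a common choice for this kind of alignment) and uses random vectors as stand-ins for encoder outputs. All function names and the temperature value are hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def similarity_matrix(batch_a, batch_b, temperature):
    """Temperature-scaled cosine similarity between every pair (a_i, b_j)."""
    a = [l2_normalize(v) for v in batch_a]
    b = [l2_normalize(v) for v in batch_b]
    return [[sum(x * y for x, y in zip(u, w)) / temperature for w in b] for u in a]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_info_nce(batch_a, batch_b, temperature=0.1):
    """Assumed objective: average of the a-to-b and b-to-a InfoNCE losses."""
    logits = similarity_matrix(batch_a, batch_b, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))


# Stand-in embeddings: in a real pipeline these would come from modality encoders
# (for example, a language encoder and a 2D graph encoder).
random.seed(0)
language = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
graph = [[v + random.gauss(0, 0.01) for v in row] for row in language]

loss_aligned = symmetric_info_nce(language, graph)
loss_mismatched = symmetric_info_nce(language, graph[::-1])  # pairs deliberately scrambled
print(loss_aligned < loss_mismatched)
```

Correctly paired embeddings give a lower loss than scrambled pairs, which is the signal that pulls matching items from different modalities together in the shared space. Applying the same loss to each available pairing (graph-language, conformation-language, and so on) is one plausible way to extend this to more than two modalities.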
Taxonomy
Topics: Biomedical Text Mining and Ontologies
