Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages
Michael Sun, Weize Yuan, Gang Liu, Wojciech Matusik, Jie Chen

TL;DR
This paper introduces Foundation Molecular Grammar (FMG), a novel approach that uses multi-modal foundation models to induce interpretable molecular languages, improving molecular generation and property prediction with enhanced interpretability and efficiency.
Contribution
FMG leverages multi-modal foundation models to induce interpretable molecular grammars without expert annotations, enabling better molecular generation and discovery.
Findings
FMG improves synthesizability and diversity of generated molecules.
FMG enhances data efficiency in molecular tasks.
FMG provides built-in chemical interpretability.
Abstract
Recent data-efficient molecular generation approaches exploit graph grammars to introduce interpretability into the generative models. However, grammar learning therein relies on expert annotation or unreliable heuristics for algorithmic inference. We propose Foundation Molecular Grammar (FMG), which leverages multi-modal foundation models (MMFMs) to induce an interpretable molecular language. By exploiting the chemical knowledge of an MMFM, FMG renders molecules as images, describes them as text, and aligns information across modalities using prompt learning. FMG can be used as a drop-in replacement for the prior grammar learning approaches in molecular generation and property prediction. We show that FMG not only excels in synthesizability, diversity, and data efficiency but also offers built-in chemical interpretability for automated molecular discovery workflows. Code is available…
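For readers unfamiliar with graph grammars, the sketch below illustrates the general idea of generating a molecular graph by repeatedly applying production rules. It is a toy node-replacement grammar written for illustration only. It is not FMG's grammar formalism or its induction procedure, neither of which is specified in this listing. All names (`Graph`, `Rule`, `apply_rule`, `derive`, `GROW`, `STOP`) are hypothetical, and the graph tracks heavy atoms only, with no bond orders or hydrogens.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    labels: dict = field(default_factory=dict)  # node id -> atom or nonterminal label
    edges: set = field(default_factory=set)     # each edge is frozenset({u, v})


@dataclass(frozen=True)
class Rule:
    lhs: str         # nonterminal label this rule rewrites
    labels: tuple    # labels of the fragment's nodes, indexed 0..k-1
    edges: tuple     # edges inside the fragment as (i, j) index pairs
    anchor: int = 0  # fragment node that inherits the rewritten node's edges


def apply_rule(g, node, rule):
    """Return a new graph in which `node` is replaced by the rule's fragment."""
    if g.labels.get(node) != rule.lhs:
        raise ValueError(f"node {node} is not labelled {rule.lhs!r}")
    start = max(g.labels) + 1
    ids = [start + i for i in range(len(rule.labels))]

    # Drop the rewritten node and add the fragment's nodes.
    labels = {n: lab for n, lab in g.labels.items() if n != node}
    labels.update(zip(ids, rule.labels))

    # Redirect edges that touched the old node to the fragment's anchor.
    edges = set()
    for e in g.edges:
        if node in e:
            (other,) = e - {node}
            edges.add(frozenset({other, ids[rule.anchor]}))
        else:
            edges.add(e)
    # Add the fragment's internal edges.
    for i, j in rule.edges:
        edges.add(frozenset({ids[i], ids[j]}))
    return Graph(labels, edges)


def derive(rules, start_symbol="R"):
    """Apply rules in order, each to the first node carrying its nonterminal."""
    g = Graph({0: start_symbol}, set())
    for rule in rules:
        node = next(n for n, lab in g.labels.items() if lab == rule.lhs)
        g = apply_rule(g, node, rule)
    return g


# Two illustrative rules: extend a carbon chain, or terminate with an oxygen.
GROW = Rule(lhs="R", labels=("C", "R"), edges=((0, 1),), anchor=0)
STOP = Rule(lhs="R", labels=("O",), edges=(), anchor=0)

mol = derive([GROW, GROW, STOP])     # C-C-O, the heavy-atom skeleton of ethanol
print(sorted(mol.labels.values()))   # ['C', 'C', 'O']
```

Each derivation step is a readable rule application, which is the sense in which grammar-based generators are usually called interpretable: the sequence `[GROW, GROW, STOP]` is itself an explanation of how the molecule was built.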
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Biomedical Text Mining and Ontologies
