MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language
Yoel Shoshan, Moshiko Raboh, Michal Ozery-Flato, Vadim Ratner, Alex Golts, Jeffrey K. Weber, Ella Barkan, Simona Rabinovici-Cohen, Sagi Polaczek, Ido Amos, Ben Shapira, Liam Hazan, Matan Ninio, Sivan Ravid, Michael M. Danziger, Yosi Shamay, Sharon Kurant, Joseph A. Morrone

TL;DR
MAMMAL is a multi-modal foundation model that integrates diverse biological data types to improve prediction and generation tasks in drug discovery, achieving state-of-the-art results across multiple benchmarks.
Contribution
The paper introduces MAMMAL, a versatile multi-task model that unifies biological data modalities for enhanced predictive capabilities, surpassing prior task-specific models.
Findings
Achieves SOTA in 9 out of 11 downstream tasks.
Demonstrates improved classification in antibody-antigen binding prediction.
Supports diverse biological data modalities within a single architecture.
Abstract
Large language models applied to vast biological datasets have the potential to transform biology by uncovering disease mechanisms and accelerating drug development. However, current models are often siloed, trained separately on small-molecules, proteins, or transcriptomic data, limiting their ability to capture complex, multi-modal interactions. Effective drug discovery requires computational tools that integrate multiple biological entities while supporting prediction and generation, a challenge existing models struggle to address. For this purpose, we present MAMMAL - Molecular Aligned Multi-Modal Architecture and Language - a versatile method applied to create a multi-task foundation model that learns from large-scale biological datasets across diverse modalities, including proteins, small-molecules, and omics. MAMMAL's structured prompt syntax supports classification, regression,…
Code & Models
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m · model · 203 downloads · ♡ 27
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m.protein_solubility · model · 53 downloads · ♡ 5
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m.dti_bindingdb_pkd · model · 32 downloads · ♡ 2
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m.tcr_epitope_bind · model · 9 downloads · ♡ 3
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m.moleculenet_clintox_tox · model · 3 downloads
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m.moleculenet_clintox_fda · model · 5 downloads
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m.moleculenet_bbbp · model · 6 downloads · ♡ 1
- 🤗 ibm-research/biomed.omics.bl.sm.ma-ted-458m.dti_bindingdb_pkd_peer · model · 2 downloads
- 🤗 introvoyz041/biomed.omics.bl.sm.ma-ted-458m · model · 10 downloads
- 🤗 introvoyz041/biomed.omics.bl.sm.ma-ted-458m.protein_solubility · model · 10 downloads
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Diatoms and Algae Research · DNA and Biological Computing
Methods: AlphaFold
