Multi-view biomedical foundation models for molecule-target and property prediction
Parthasarathy Suryanarayanan, Yunguang Qiu, Shreyans Sethi, Diwakar Mahajan, Hongyang Li, Yuxin Yang, Elif Eyigoz, Aldo Guzman Saenz, Daniel E. Platt, Timothy H. Rumbell, Kenney Ng, Sanjoy Dey, Myson Burch, Bum Chul Kwon, Pablo Meyer, Feixiong Cheng, Jianying Hu

TL;DR
This paper introduces MMELON, a multi-view foundation model integrating graph, image, and text representations for molecules, achieving robust performance across diverse biomedical tasks and aiding drug discovery.
Contribution
The paper develops a novel multi-view molecular embedding approach that combines multiple representations in a foundation model, enhancing robustness and applicability in biomedical research.
Findings
Multi-view model matches top single-view performance.
Validated on 120+ tasks including solubility and GPCR activity.
Identified potential Alzheimer's-related GPCR targets.
Abstract
Quality molecular representations are key to foundation model development in biomedical research. Previous efforts have typically focused on a single representation or molecular view, which may have strengths or weaknesses on a given task. We develop Multi-view Molecular Embedding with Late Fusion (MMELON), an approach that integrates graph, image and text views in a foundation model setting and may be readily extended to additional representations. Single-view foundation models are each pre-trained on a dataset of up to 200M molecules. The multi-view model performs robustly, matching the performance of the highest-ranked single view. It is validated on over 120 tasks, including molecular solubility, ADME properties, and activity against G protein-coupled receptors (GPCRs). We identify 33 GPCRs that are related to Alzheimer's disease and employ the multi-view model to select strong…
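To make the late-fusion idea concrete, here is a minimal sketch of combining per-view embeddings after each view has been encoded separately. It is not the authors' implementation: the softmax-weighted sum, the function names `softmax` and `late_fusion`, the per-view scores, and the toy two-dimensional embeddings are all assumptions chosen for illustration, since this summary does not describe the actual aggregator.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(
    view_embeddings: Dict[str, List[float]],
    view_scores: Dict[str, float],
) -> List[float]:
    """Fuse per-view embeddings with a softmax-weighted sum.

    Hypothetical aggregator: each view (e.g. graph, image, text) is assumed
    to be encoded independently into a vector of the same dimension, and the
    views are only combined at this final step.
    """
    names = sorted(view_embeddings)
    dim = len(view_embeddings[names[0]])
    if any(len(view_embeddings[n]) != dim for n in names):
        raise ValueError("all view embeddings must share one dimension")

    weights = softmax([view_scores[n] for n in names])
    fused = [0.0] * dim
    for weight, name in zip(weights, names):
        for i, value in enumerate(view_embeddings[name]):
            fused[i] += weight * value
    return fused


# Toy embeddings (made up for illustration, not model outputs).
views = {"graph": [1.0, 0.0], "image": [0.0, 1.0], "text": [1.0, 1.0]}
# Equal scores give each view the same weight of 1/3.
scores = {"graph": 0.0, "image": 0.0, "text": 0.0}
fused = late_fusion(views, scores)
```

Because the views meet only at the fusion step, adding a further representation in this sketch amounts to adding one more entry to each dictionary, which is consistent with the abstract's remark that the approach may be extended to additional views.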
Code & Models
- 🤗 ibm-research/biomed.sm.mv-te-84m · model · 766 dl · ♡ 19
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-BACE-101 · model · 1.0k dl · ♡ 3
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-MUV-101 · model · 13 dl · ♡ 2
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-SIDER-101 · model · 8 dl · ♡ 2
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-TOX21-101 · model · 6 dl · ♡ 1
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-LIPOPHILICITY-101 · model · 1.0k dl · ♡ 1
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-TOXCAST-101 · model · 3 dl · ♡ 1
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-CLINTOX-101 · model · 13 dl · ♡ 1
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-ESOL-101 · model · 11 dl · ♡ 1
- 🤗 ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-BBBP-101 · model · 13 dl · ♡ 1
Taxonomy
Methods: Sparse Evolutionary Training · Lib
