Learn molecular representations from large-scale unlabeled molecules for drug discovery
Pengyong Li, Jun Wang, Yixuan Qiao, Hao Chen, Yihuan Yu, Xiaojun Yao, Peng Gao, Guotong Xie, Sen Song

TL;DR
This paper introduces MPG, a novel graph-based pre-training framework for molecular representations that leverages large-scale unlabeled data to improve drug discovery tasks.
Contribution
It presents a new self-supervised pre-training method using MolGNet on 11 million molecules, enhancing molecular representation for various drug discovery applications.
Findings
Pre-trained MolGNet captures valuable chemistry insights.
Achieves state-of-the-art results on 13 benchmark datasets.
Effective for multiple drug discovery tasks.
Abstract
How to produce expressive molecular representations is a fundamental challenge in AI-driven drug discovery. Graph neural networks (GNNs) have emerged as a powerful technique for modeling molecular data. However, previous supervised approaches usually suffer from the scarcity of labeled data and generalize poorly. Here, we propose a novel Molecular Pre-training Graph-based deep learning framework, named MPG, that learns molecular representations from large-scale unlabeled molecules. In MPG, we propose a powerful MolGNet model and an effective self-supervised strategy for pre-training the model at both the node and graph levels. After pre-training on 11 million unlabeled molecules, we show that MolGNet can capture valuable chemistry insights to produce interpretable representations. The pre-trained MolGNet can be fine-tuned with just one additional output layer to create…
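The "one additional output layer" idea in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: `encode_graph` is a toy mean-aggregation encoder standing in for a pre-trained model (it is not MolGNet), `PRETRAINED_W` is a placeholder for pre-trained weights, and the two tiny graphs and their labels are made up. For simplicity the encoder is kept frozen and only the output layer is trained; the abstract does not say whether the encoder is frozen during fine-tuning.

```python
def encode_graph(node_feats, edges, weight, rounds=2):
    """Toy stand-in for a pre-trained graph encoder (hypothetical, not MolGNet).

    Each round, every node averages its own features with its neighbours',
    then applies a shared linear map followed by ReLU. The graph-level
    embedding is the mean of the final node embeddings.
    """
    n = len(node_feats)
    dim = len(weight)
    neighbours = [[i] for i in range(n)]  # each node also sees itself
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    h = [list(f) for f in node_feats]
    for _ in range(rounds):
        agg = [[sum(h[j][k] for j in neighbours[i]) / len(neighbours[i])
                for k in range(dim)] for i in range(n)]
        h = [[max(0.0, sum(weight[r][k] * agg[i][k] for k in range(dim)))
              for r in range(dim)] for i in range(n)]
    return [sum(h[i][k] for i in range(n)) / n for k in range(dim)]


def predict(embedding, head_w, head_b):
    """The single additional output layer: a linear map on the embedding."""
    return sum(w * x for w, x in zip(head_w, embedding)) + head_b


def fit_output_layer(embeddings, targets, lr=0.1, epochs=500):
    """Train only the output layer with per-sample gradient steps (squared error)."""
    head_w, head_b = [0.0] * len(embeddings[0]), 0.0
    for _ in range(epochs):
        for emb, y in zip(embeddings, targets):
            err = predict(emb, head_w, head_b) - y
            head_w = [w - lr * err * x for w, x in zip(head_w, emb)]
            head_b -= lr * err
    return head_w, head_b


# Placeholder "pre-trained" weights and two made-up labeled graphs.
PRETRAINED_W = [[1.0, 0.0], [0.0, 1.0]]
graphs = [
    ([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [(0, 1), (1, 2)]),
    ([[0.0, 1.0], [0.0, 1.0]], [(0, 1)]),
]
labels = [1.0, 0.0]

embeddings = [encode_graph(x, e, PRETRAINED_W) for x, e in graphs]
head_w, head_b = fit_output_layer(embeddings, labels)
preds = [predict(z, head_w, head_b) for z in embeddings]
```

The point of the split is that the encoder is reused unchanged across tasks, while each downstream task contributes only its own small `head_w`, `head_b` pair.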
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Protein Structure and Dynamics
Methods: Graph Neural Network
