Motif-based Graph Self-Supervised Learning for Molecular Property Prediction
Zaixi Zhang, Qi Liu, Hao Wang, Chengqiang Lu, Chee-Kong Lee

TL;DR
This paper introduces a novel motif-based self-supervised learning framework for GNNs that captures subgraph and motif information in molecular graphs, improving molecular property prediction especially with limited labeled data.
Contribution
The paper proposes a new motif-based pre-training method for GNNs, including a molecule fragmentation technique and a multi-level generative pre-training framework, enhancing molecular property prediction.
Findings
Outperforms state-of-the-art baselines on benchmark tasks.
Effectively captures subgraph and motif information.
Improves performance with limited labeled data.
Abstract
Predicting molecular properties with data-driven methods has drawn much attention in recent years. In particular, Graph Neural Networks (GNNs) have demonstrated remarkable success in various molecular generation and prediction tasks. In cases where labeled data is scarce, GNNs can be pre-trained on unlabeled molecular data to first learn general semantic and structural information before being fine-tuned for specific tasks. However, most existing self-supervised pre-training frameworks for GNNs focus only on node-level or graph-level tasks. These approaches cannot capture the rich information in subgraphs or graph motifs. For example, functional groups (frequently occurring subgraphs in molecular graphs) often carry indicative information about molecular properties. To bridge this gap, we propose Motif-based Graph Self-supervised Learning (MGSSL) by introducing a novel…
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Advanced Graph Neural Networks
