Motif-based Graph Self-Supervised Learning for Molecular Property Prediction

Zaixi Zhang; Qi Liu; Hao Wang; Chengqiang Lu; Chee-Kong Lee

arXiv:2110.00987 · q-bio.QM · October 19, 2021 · 44 cites

TL;DR

This paper introduces a novel motif-based self-supervised learning framework for GNNs that captures subgraph and motif information in molecular graphs, improving molecular property prediction especially with limited labeled data.

Contribution

The paper proposes a new motif-based pre-training method for GNNs, including a molecule fragmentation technique and a multi-level generative pre-training framework, enhancing molecular property prediction.
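As a rough illustration of what a molecule fragmentation step does, the sketch below splits a toy molecular graph into motifs by deleting a chosen set of bonds and collecting the connected components that remain. It is a minimal sketch under stated assumptions: the function name `fragment_molecule`, the integer atom indexing, and the choice of which bond to cut are all hypothetical and are not taken from the paper, whose actual fragmentation rules are not described in this summary.

```python
from collections import defaultdict


def fragment_molecule(num_atoms, bonds, cut_bonds):
    """Split a molecular graph into motifs by deleting the bonds in cut_bonds.

    Hypothetical helper for illustration only: atoms are integers
    0..num_atoms-1 and bonds are (atom, atom) pairs.
    """
    cut = {frozenset(b) for b in cut_bonds}

    # Build the adjacency structure, skipping the bonds selected for cutting.
    adjacency = defaultdict(set)
    for a, b in bonds:
        if frozenset((a, b)) not in cut:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Each connected component of the remaining graph is one motif.
    seen, motifs = set(), []
    for start in range(num_atoms):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nbr in adjacency[atom]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        motifs.append(sorted(component))
    return motifs


# Toy example: heavy atoms of ethyl acetate, CH3-C(=O)-O-CH2-CH3.
atoms = ["C", "C", "O", "O", "C", "C"]
bonds = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5)]

# Assumption: we cut the single bond between the carbonyl carbon (1)
# and the ester oxygen (3); a real method would pick bonds by chemical rules.
motifs = fragment_molecule(len(atoms), bonds, cut_bonds=[(1, 3)])
print(motifs)
```

In a motif-based pre-training setup, fragments like these would serve as the subgraph-level units that sit between individual atoms and the whole molecule.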

Findings

01. Outperforms state-of-the-art baselines on benchmark tasks.

02. Effectively captures subgraph and motif information.

03. Improves performance with limited labeled data.

Abstract

Predicting molecular properties with data-driven methods has drawn much attention in recent years. In particular, Graph Neural Networks (GNNs) have demonstrated remarkable success in various molecular generation and prediction tasks. When labeled data is scarce, GNNs can be pre-trained on unlabeled molecular data to first learn general semantic and structural information before being fine-tuned for specific tasks. However, most existing self-supervised pre-training frameworks for GNNs focus only on node-level or graph-level tasks. These approaches cannot capture the rich information in subgraphs or graph motifs. For example, functional groups (frequently occurring subgraphs in molecular graphs) often carry indicative information about molecular properties. To bridge this gap, we propose Motif-based Graph Self-supervised Learning (MGSSL) by introducing a novel…

Code & Models

Repositories

zaixizhang/MGSSL · PyTorch · Official

Videos

Motif-based Graph Self-Supervised Learning for Molecular Property Prediction · SlidesLive

Taxonomy

Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Advanced Graph Neural Networks