Improving Self-supervised Molecular Representation Learning using Persistent Homology

Yuankai Luo; Lei Shi; Veronika Thost

arXiv:2311.17327·cs.LG·November 30, 2023·2 cites

TL;DR

This paper introduces a self-supervised learning method for molecular representations based on persistent homology and reports improved predictive power, especially on small datasets, through a new contrastive loss.

Contribution

The paper proposes a new SSL approach based on persistent homology, including an autoencoder and a contrastive loss, which enhances molecular property prediction performance.

Findings

01. Representations are more predictive after SSL.

02. The contrastive loss improves baseline performance.

03. Significant gains on small datasets.

Abstract

Self-supervised learning (SSL) has great potential for molecular representation learning given the complexity of molecular graphs, the large amounts of unlabelled data available, the considerable cost of obtaining labels experimentally, and the consequently small training datasets. The importance of the topic is reflected in the variety of paradigms and architectures that have been investigated recently. Yet the differences in performance often seem minor and are barely understood to date. In this paper, we study SSL based on persistent homology (PH), a mathematical tool for modeling topological features of data that persist across multiple scales. It has several unique features which particularly suit SSL, naturally offering: different views of the data, stability in terms of distance preservation, and the opportunity to flexibly incorporate domain knowledge. We (1) investigate an…
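To make "topological features that persist across multiple scales" concrete, here is a minimal sketch of 0-dimensional persistent homology on a graph. It is an illustration only, not the paper's method or code: the function name `zero_dim_persistence`, the toy path graph, and the per-vertex filter values are all hypothetical. The sketch assumes a sublevel filtration in which a vertex appears at its filter value and an edge appears once both endpoints are present; the choice of filter function is one place where domain knowledge could enter.

```python
import math


def zero_dim_persistence(vertex_values, edges):
    """Return sorted (birth, death) pairs of connected components.

    Assumes a sublevel filtration: vertex i enters at vertex_values[i],
    and edge (u, v) enters at max(vertex_values[u], vertex_values[v]).
    Pairs with zero persistence (birth == death) are dropped.
    """
    n = len(vertex_values)
    parent = list(range(n))
    # birth[r] is the smallest filter value in the component rooted at r
    birth = list(vertex_values)

    def find(v):
        # union-find lookup with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    weighted = sorted((max(vertex_values[u], vertex_values[v]), u, v)
                      for u, v in edges)
    pairs = []
    for t, u, v in weighted:
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge closes a cycle (a 1-dimensional feature, not tracked)
        # elder rule: the component born later dies when two components merge
        if birth[ru] > birth[rv]:
            ru, rv = rv, ru
        if birth[rv] < t:
            pairs.append((birth[rv], t))
        parent[rv] = ru  # the older root stays the root
    # components that are never merged persist forever
    roots = {find(v) for v in range(n)}
    pairs.extend((birth[r], math.inf) for r in roots)
    return sorted(pairs)


# Toy example: a path graph 0-1-2-3 with hypothetical filter values.
diagram = zero_dim_persistence([0.0, 2.0, 1.0, 3.0], [(0, 1), (1, 2), (2, 3)])
print(diagram)  # [(0.0, inf), (1.0, 2.0)]
```

The resulting list of (birth, death) pairs is a persistence diagram: the component born at 0.0 never dies, while the one born at 1.0 is absorbed at 2.0. Changing the filter values yields a different diagram for the same graph, which is one way to read the abstract's remark about "different views of the data".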

Code & Models

Repositories

luoyk1999/molecular-homology (PyTorch, official)

Videos

Improving Self-supervised Molecular Representation Learning using Persistent Homology · SlidesLive

Taxonomy

Topics: Topological and Geometric Data Analysis · Bioinformatics and Genomic Networks · Computational Drug Discovery Methods