TL;DR
This paper introduces iMolCLR, a novel molecular contrastive learning method that mitigates faulty negatives and leverages fragment-level contrast, significantly enhancing molecular property prediction accuracy.
Contribution
The work proposes two key innovations: using cheminformatics similarity between molecules to reduce the effect of faulty negatives, and contrasting at the fragment level to improve molecular representations.
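To make the first idea concrete, here is a minimal sketch of a contrastive loss for a single anchor in which each negative is down-weighted by its fingerprint similarity to the anchor. Everything in it is an illustrative assumption rather than the paper's exact formulation: the function names, the weight form `1 - lam * tanimoto`, placing the weight outside the exponential, the `lam` and `tau` defaults, and the representation of fingerprints as sets of on-bit indices.

```python
import math


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def cosine(u, v):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_contrastive_loss(anchor, positive, negatives,
                              anchor_fp, negative_fps, lam=0.5, tau=0.1):
    """InfoNCE-style loss for one anchor.

    Assumption: a negative whose fingerprint resembles the anchor's is
    likely a faulty negative, so its term is scaled by
    (1 - lam * tanimoto). With lam=0 this is the unweighted loss.
    """
    pos = math.exp(cosine(anchor, positive) / tau)
    neg = 0.0
    for z, fp in zip(negatives, negative_fps):
        weight = 1.0 - lam * tanimoto(anchor_fp, fp)
        neg += weight * math.exp(cosine(anchor, z) / tau)
    return -math.log(pos / (pos + neg))


# Toy inputs: the first negative shares the anchor's fingerprint exactly,
# the second shares no bits with it.
anchor, positive = [1.0, 0.0], [0.9, 0.1]
negatives = [[0.8, 0.2], [0.0, 1.0]]
anchor_fp, negative_fps = {1, 2, 3}, [{1, 2, 3}, {7, 8}]

plain = weighted_contrastive_loss(anchor, positive, negatives,
                                  anchor_fp, negative_fps, lam=0.0)
weighted = weighted_contrastive_loss(anchor, positive, negatives,
                                     anchor_fp, negative_fps, lam=0.5)
```

In this toy setup the chemically similar negative contributes less to the denominator, so `weighted` comes out smaller than `plain`: the model is penalized less for placing similar molecules close together.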
Findings
Achieved 1.3% average ROC-AUC improvement on classification benchmarks.
Reduced regression error by an average of 4.8% across benchmarks.
Pre-trained models rival or surpass supervised models with complex architectures.
Abstract
Deep learning has become prevalent in computational chemistry and is widely used for molecular property prediction. Recently, self-supervised learning (SSL), especially contrastive learning (CL), has gathered growing attention for its potential to learn molecular representations that generalize to the gigantic chemical space. Unlike supervised learning, SSL can directly leverage large unlabeled data, which greatly reduces the effort to acquire molecular property labels through costly and time-consuming simulations or experiments. However, most molecular SSL methods borrow insights from the machine learning community but neglect the unique cheminformatics (e.g., molecular fingerprints) and multi-level graphical structures (e.g., functional groups) of molecules. In this work, we propose iMolCLR: improvement of Molecular Contrastive Learning of Representations with graph neural networks…
Taxonomy
Methods: Contrastive Learning
