Investigating Graph Neural Networks and Classical Feature-Extraction Techniques in Activity-Cliff and Molecular Property Prediction

Markus Dablander

arXiv:2411.13688 · cs.LG · November 22, 2024

TL;DR

This paper compares classical molecular featurisation techniques with graph neural networks for property and activity-cliff prediction, introduces novel models and pooling methods, and discusses future research directions.

Contribution

The paper provides a comprehensive analysis and comparison of classical and graph-based molecular featurisations, introduces a new twin neural network model, and proposes innovative substructure pooling techniques.

Findings

01. Sort & Slice outperforms hash-based folding in property prediction

02. GNNs and classical methods show complementary strengths

03. Proposed models improve activity-cliff prediction accuracy
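
The first finding contrasts two ways of compressing a molecule's set of substructure identifiers into a fixed-length binary fingerprint. The sketch below is illustrative only and is not taken from the paper. It assumes that substructures are already available as integer identifiers, that hash-based folding maps each identifier to a position via a modulo operation, and that Sort & Slice ranks identifiers by how many training molecules contain them and keeps the most frequent ones. This page does not spell out either procedure, and the function names are hypothetical.

```python
from collections import Counter


def fold_by_hashing(substructure_ids, length):
    """Hash-based folding (assumed form): identifier -> position id % length.

    Different substructures can collide on the same position.
    """
    fingerprint = [0] * length
    for ident in substructure_ids:
        fingerprint[ident % length] = 1
    return fingerprint


def fit_sort_and_slice(training_molecules, length):
    """Sort & Slice (assumed form): rank identifiers by the number of
    training molecules containing them and keep the top `length`.

    Returns a dict mapping each kept identifier to its fingerprint position.
    """
    counts = Counter()
    for ids in training_molecules:
        counts.update(set(ids))  # count each identifier once per molecule
    # Descending frequency; ties broken by identifier for determinism.
    ranked = sorted(counts, key=lambda ident: (-counts[ident], ident))
    return {ident: pos for pos, ident in enumerate(ranked[:length])}


def sort_and_slice(substructure_ids, vocabulary, length):
    """Set one position per kept substructure; unseen or rare ones are dropped."""
    fingerprint = [0] * length
    for ident in substructure_ids:
        pos = vocabulary.get(ident)
        if pos is not None:
            fingerprint[pos] = 1
    return fingerprint


# Toy data: each molecule is a set of made-up integer substructure identifiers.
train = [{3, 7, 11}, {3, 7}, {3, 19}]
vocabulary = fit_sort_and_slice(train, length=2)

print(fold_by_hashing({3, 7, 11}, length=4))          # three ids share one position
print(sort_and_slice({7, 11}, vocabulary, length=2))  # only frequent ids are kept
```

In this toy example, folding sends identifiers 3, 7, and 11 to the same position, so the fingerprint cannot tell them apart. Under the stated assumptions, the Sort & Slice variant gives every retained substructure its own position and discards the rest.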

Abstract

Molecular featurisation refers to the transformation of molecular data into numerical feature vectors. It is one of the key research areas in molecular machine learning and computational drug discovery. Recently, message-passing graph neural networks (GNNs) have emerged as a novel method to learn differentiable features directly from molecular graphs. While such techniques hold great promise, further investigations are needed to clarify if and when they indeed manage to definitively outcompete classical molecular featurisations such as extended-connectivity fingerprints (ECFPs) and physicochemical-descriptor vectors (PDVs). We systematically explore and further develop classical and graph-based molecular featurisation methods for two important tasks: molecular property prediction, in particular, quantitative structure-activity relationship (QSAR) prediction, and the largely unexplored…
