Investigating Graph Neural Networks and Classical Feature-Extraction Techniques in Activity-Cliff and Molecular Property Prediction
Markus Dablander

TL;DR
This paper compares classical molecular featurisation techniques with graph neural networks for molecular property prediction and activity-cliff prediction, introduces a twin neural network model and new substructure pooling techniques, and discusses future research directions.
Contribution
The work analyses and compares classical and graph-based molecular featurisations, introduces a new twin neural network model, and proposes new substructure pooling techniques.
Findings
Sort & Slice outperforms hash-based folding in property prediction
GNNs and classical methods show complementary strengths
Proposed models improve activity-cliff prediction accuracy
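To make the first finding concrete, the following is a minimal sketch of the two substructure pooling ideas being compared. It is not the author's implementation. It assumes that each molecule has already been reduced to a set of integer substructure identifiers (as an ECFP-style algorithm would produce), and that Sort & Slice ranks identifiers by how many training molecules contain them and keeps the most frequent ones. The function names `hash_fold`, `fit_sort_and_slice`, and `sort_and_slice`, and the toy identifiers, are hypothetical.

```python
from collections import Counter


def hash_fold(substructure_ids, length):
    """Hash-based folding: map each identifier to a bit via modulo."""
    vec = [0] * length
    for ident in substructure_ids:
        vec[ident % length] = 1  # distinct identifiers can collide here
    return vec


def fit_sort_and_slice(training_sets, length):
    """Assumed Sort & Slice fit: rank identifiers by training-set prevalence."""
    counts = Counter()
    for ids in training_sets:
        counts.update(set(ids))  # count each identifier once per molecule
    # Sort by descending prevalence; break ties by identifier for determinism.
    ranked = sorted(counts, key=lambda i: (-counts[i], i))
    # Slice: keep only the `length` most prevalent identifiers.
    return {ident: pos for pos, ident in enumerate(ranked[:length])}


def sort_and_slice(substructure_ids, vocab, length):
    """Set one bit per retained identifier; ignore identifiers sliced away."""
    vec = [0] * length
    for ident in substructure_ids:
        pos = vocab.get(ident)
        if pos is not None:
            vec[pos] = 1
    return vec


# Toy training set: three molecules as sets of hypothetical identifiers.
train = [{11, 27, 35}, {11, 27, 43}, {11, 59}]
vocab = fit_sort_and_slice(train, length=4)

query = {11, 27, 35}
folded = hash_fold(query, length=4)
sliced = sort_and_slice(query, vocab, length=4)
print(folded)
print(sliced)
```

In this toy case all three identifiers of the query collide in a single bit under folding, whereas the sorted-and-sliced vector keeps them in separate positions. The example only illustrates the mechanism; it says nothing about predictive performance.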
Abstract
Molecular featurisation refers to the transformation of molecular data into numerical feature vectors. It is one of the key research areas in molecular machine learning and computational drug discovery. Recently, message-passing graph neural networks (GNNs) have emerged as a novel method to learn differentiable features directly from molecular graphs. While such techniques hold great promise, further investigations are needed to clarify if and when they indeed manage to definitively outcompete classical molecular featurisations such as extended-connectivity fingerprints (ECFPs) and physicochemical-descriptor vectors (PDVs). We systematically explore and further develop classical and graph-based molecular featurisation methods for two important tasks: molecular property prediction, in particular, quantitative structure-activity relationship (QSAR) prediction, and the largely unexplored…
