Molecular Topological Profile (MOLTOP) -- Simple and Strong Baseline for Molecular Graph Classification
Jakub Adamczyk, Wojciech Czech

TL;DR
This paper introduces MOLTOP, a simple, fast, and hyperparameter-free topological descriptor-based method for molecular graph classification that rivals modern GNNs across multiple benchmarks.
Contribution
MOLTOP is a novel, effective baseline combining topological descriptors with a Random Forest, demonstrating competitive performance against advanced GNNs.
Findings
MOLTOP's discriminative power surpasses the 1-WL test for some classes of graphs.
It performs comparably to or better than some GNNs.
The method is fast, low-variance, and hyperparameter-free.
Abstract
We revisit the effectiveness of topological descriptors for molecular graph classification and design a simple, yet strong baseline. We demonstrate that a simple approach to feature engineering - employing histogram aggregation of edge descriptors and one-hot encoding for atomic numbers and bond types - when combined with a Random Forest classifier, can establish a strong baseline for Graph Neural Networks (GNNs). The novel algorithm, Molecular Topological Profile (MOLTOP), integrates Edge Betweenness Centrality, Adjusted Rand Index and SCAN Structural Similarity score. This approach proves to be remarkably competitive when compared to modern GNNs, while also being simple, fast, low-variance and hyperparameter-free. Our approach is rigorously tested on MoleculeNet datasets using the fair evaluation protocol provided by the Open Graph Benchmark. We additionally show out-of-domain generation…
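To make "histogram aggregation of edge descriptors" concrete, here is a minimal sketch in plain Python for one of the three descriptors named in the abstract, the SCAN Structural Similarity score. It uses the standard SCAN definition over closed neighborhoods, sigma(u, v) = |N[u] ∩ N[v]| / sqrt(|N[u]| · |N[v]|). The function names, the fixed [0, 1] range, and the choice of 5 bins are illustrative assumptions, not the authors' implementation. Edge Betweenness Centrality, the Adjusted Rand Index, the atom and bond encodings, and the Random Forest step are omitted here.

```python
from math import sqrt


def scan_similarity(adj, u, v):
    """SCAN structural similarity of edge (u, v), using closed neighborhoods."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def histogram(values, n_bins, lo=0.0, hi=1.0):
    """Fixed-range histogram; values equal to `hi` fall into the last bin."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in values:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def edge_profile(edges, n_bins=5):
    """Turn a variable-size edge list into a fixed-length feature vector.

    n_bins=5 is an arbitrary illustrative value, not taken from the paper.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    sims = [scan_similarity(adj, u, v) for u, v in edges]
    return histogram(sims, n_bins)


# Toy graph: a triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
profile = edge_profile(edges)
print(profile)  # [0, 0, 0, 1, 3]
```

Because the histogram has a fixed length regardless of how many bonds a molecule has, vectors like this can be concatenated with other per-molecule features and passed to a tabular classifier such as a Random Forest.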
Taxonomy
Topics: Computational Drug Discovery Methods · Various Chemistry Research Topics · Machine Learning in Bioinformatics
