Expressivity and Generalization: Fragment-Biases for Molecular GNNs
Tom Wollschläger, Niklas Kemper, Leon Hetzel, Johanna Sommer, and Stephan Günnemann

TL;DR
This paper introduces a theoretical framework and a new GNN architecture that leverage fragment information to enhance expressiveness and generalization in molecular property prediction, outperforming existing models on multiple datasets.
Contribution
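The paper develops the Fragment-WL test, an extension of the Weisfeiler & Leman (WL) test, for the theoretical analysis of fragment-biased GNNs. It also proposes a new GNN architecture and a fragmentation with infinite vocabulary, improving expressiveness and generalization.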
Findings
Outperforms all GNNs on the Peptides dataset
Achieves 12% lower error than all GNNs on ZINC
Shows superior generalization over transformer-based models
Abstract
Although recent advances in higher-order Graph Neural Networks (GNNs) improve the theoretical expressiveness and molecular property predictive performance, they often fall short of the empirical performance of models that explicitly use fragment information as inductive bias. However, for these approaches, there exists no theoretic expressivity study. In this work, we propose the Fragment-WL test, an extension to the well-known Weisfeiler & Leman (WL) test, which enables the theoretic analysis of these fragment-biased GNNs. Building on the insights gained from the Fragment-WL test, we develop a new GNN architecture and a fragmentation with infinite vocabulary that significantly boosts expressiveness. We show the effectiveness of our model on synthetic and real-world data where we outperform all GNNs on Peptides and have 12% lower error than all GNNs on ZINC and 34% lower error than…
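For readers unfamiliar with the WL test that the abstract builds on, the following is a minimal sketch of standard 1-WL color refinement, not the paper's Fragment-WL test. The helper names (`cycle`, `wl_histogram`) and the toy graphs are assumptions made for illustration. The example shows a well-known limitation: with plain atom labels, 1-WL cannot tell a six-membered ring from two disjoint three-membered rings. Attaching a ring-size label to each node, used here as a hypothetical stand-in for fragment information, separates them.

```python
from collections import Counter


def cycle(nodes):
    """Adjacency dict for a simple cycle over the given node ids."""
    n = len(nodes)
    return {nodes[i]: [nodes[(i - 1) % n], nodes[(i + 1) % n]] for i in range(n)}


def wl_histogram(adj, labels, rounds=3):
    """Standard 1-WL color refinement (illustrative, not Fragment-WL).

    adj:    dict mapping node -> list of neighbor nodes
    labels: dict mapping node -> initial hashable, sortable label
    Returns the histogram of node colors after `rounds` refinement steps.
    """
    colors = dict(labels)
    for _ in range(rounds):
        # New color = own color plus the sorted multiset of neighbor colors.
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


# Toy graphs: one 6-ring versus two disjoint 3-rings, all carbon atoms.
hexagon = cycle([0, 1, 2, 3, 4, 5])
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}

# Plain atom labels: every node looks the same to 1-WL in both graphs.
plain = {v: "C" for v in range(6)}
same_without_fragments = (
    wl_histogram(hexagon, plain) == wl_histogram(two_triangles, plain)
)

# Hypothetical fragment bias: annotate each atom with the size of its ring.
hex_ring_labels = {v: ("C", 6) for v in range(6)}
tri_ring_labels = {v: ("C", 3) for v in range(6)}
same_with_fragments = (
    wl_histogram(hexagon, hex_ring_labels)
    == wl_histogram(two_triangles, tri_ring_labels)
)

print(same_without_fragments, same_with_fragments)
```

Two graphs with different color histograms are certainly non-isomorphic, while equal histograms are inconclusive. The ring-size annotation here is deliberately simplistic; it only illustrates why substructure information can raise expressiveness beyond plain 1-WL.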
Taxonomy
Topics: Computational Drug Discovery Methods
Methods: Fragmentation
