Expressivity and Generalization: Fragment-Biases for Molecular GNNs

Tom Wollschläger, Niklas Kemper, Leon Hetzel, Johanna Sommer, Stephan Günnemann

arXiv:2406.08210·cs.LG·July 26, 2024·3 cites

TL;DR

This paper introduces a theoretical framework and a new GNN architecture that leverage fragment information to enhance expressiveness and generalization in molecular property prediction, outperforming existing models on multiple datasets.

Contribution

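The paper develops the Fragment-WL test, an extension of the Weisfeiler & Leman (WL) test for analyzing the expressivity of fragment-biased GNNs, and proposes a new GNN architecture together with a fragmentation with infinite vocabulary, improving expressiveness and generalization.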

Findings

01 Outperforms all GNNs on the Peptides dataset

02 Achieves 12% lower error than all GNNs on ZINC

03 Shows superior generalization compared with transformer-based models

Abstract

Although recent advances in higher-order Graph Neural Networks (GNNs) improve the theoretical expressiveness and molecular property predictive performance, they often fall short of the empirical performance of models that explicitly use fragment information as inductive bias. However, for these approaches, there exists no theoretic expressivity study. In this work, we propose the Fragment-WL test, an extension to the well-known Weisfeiler & Leman (WL) test, which enables the theoretic analysis of these fragment-biased GNNs. Building on the insights gained from the Fragment-WL test, we develop a new GNN architecture and a fragmentation with infinite vocabulary that significantly boosts expressiveness. We show the effectiveness of our model on synthetic and real-world data where we outperform all GNNs on Peptides and have 12% lower error than all GNNs on ZINC and 34% lower error than…
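
For readers unfamiliar with the WL test that the abstract builds on, the following is a minimal sketch of classical 1-WL color refinement. It is not the paper's Fragment-WL test, whose definition is not given on this page. The function names (wl_histogram, wl_indistinguishable), the adjacency-dict graph format, and the example graphs are assumptions chosen for illustration. The example shows a well-known limitation: 1-WL cannot tell a six-membered ring from two disjoint triangles, which is one common motivation for giving a model explicit substructure (for example ring or fragment) information.

```python
from collections import Counter


def wl_histogram(adj, rounds=3):
    """Classical 1-WL color refinement (illustrative sketch, not Fragment-WL).

    adj: dict mapping each node to a list of its neighbors (assumed format).
    Returns the multiset (Counter) of node colors after `rounds` iterations.
    """
    # Every node starts with the same color; nested tuples serve as colors,
    # so colors are directly comparable across different graphs.
    colors = {v: () for v in adj}
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


def wl_indistinguishable(adj_a, adj_b, rounds=3):
    """True if 1-WL yields identical color histograms for both graphs."""
    return wl_histogram(adj_a, rounds) == wl_histogram(adj_b, rounds)


# A six-membered ring versus two disjoint triangles (both are 2-regular).
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}

# A path and a star on four nodes have different degree sequences.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

ring_vs_triangles = wl_indistinguishable(hexagon, two_triangles)
path_vs_star = wl_indistinguishable(path, star)
```

In this sketch, ring_vs_triangles evaluates to True because every node in both graphs sees the same neighborhood colors at every round, while path_vs_star evaluates to False because the degree sequences already differ after the first round. Nested tuples are used as colors for simplicity; they grow with each round, so a practical implementation would typically compress them to integer labels.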

Taxonomy

Topics: Computational Drug Discovery Methods

Methods: Fragmentation