Graph Representational Learning: When Does More Expressivity Hurt Generalization?

Sohir Maskey; Raffaele Paolino; Fabian Jogl; Gitta Kutyniok; Johannes F. Lutzeyer

arXiv:2505.11298 · cs.LG · May 19, 2025

TL;DR

This paper investigates how the expressivity of Graph Neural Networks affects their ability to generalize, revealing that higher expressivity can hurt performance unless balanced by training data size or graph similarity.

Contribution

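The paper introduces a family of premetrics that capture different degrees of structural similarity between graphs, and derives generalization bounds linking expressivity, training set size, and the distance between training and test graphs.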

Findings

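01

More expressive GNNs may generalize worse unless their added complexity is offset by a sufficiently large training set or a smaller distance between training and test graphs.

02

Generalization bounds depend on the distance between training and test graphs, model complexity, and training set size.

03

Empirical results support the theoretical insights.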

Abstract

Graph Neural Networks (GNNs) are powerful tools for learning on structured data, yet the relationship between their expressivity and predictive performance remains unclear. We introduce a family of premetrics that capture different degrees of structural similarity between graphs and relate these similarities to generalization, and consequently, the performance of expressive GNNs. By considering a setting where graph labels are correlated with structural features, we derive generalization bounds that depend on the distance between training and test graphs, model complexity, and training set size. These bounds reveal that more expressive GNNs may generalize worse unless their increased complexity is balanced by a sufficiently large training set or reduced distance between training and test graphs. Our findings relate expressivity and generalization, offering theoretical insights supported…
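The trade-off described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's bound: the additive form, the function name `toy_gap_bound`, and all numeric values are assumptions chosen only to show how a distance term, a complexity term, and the training set size might interact.

```python
import math


def toy_gap_bound(distance, complexity, num_train, lipschitz=1.0):
    """Toy stand-in for a generalization-gap bound (assumed form).

    distance   -- distance between training and test graphs
    complexity -- a scalar proxy for model complexity
    num_train  -- number of training graphs
    lipschitz  -- assumed Lipschitz constant of the model
    """
    if num_train <= 0:
        raise ValueError("num_train must be positive")
    # Structural term grows with train/test distance;
    # capacity term shrinks as the training set grows.
    return lipschitz * distance + math.sqrt(complexity / num_train)


# Hypothetical settings: a simple model and a more expressive one.
simple = toy_gap_bound(distance=0.5, complexity=10, num_train=100)
expressive = toy_gap_bound(distance=0.5, complexity=1000, num_train=100)

# Two ways the expressive model might close the gap in this toy setting:
more_data = toy_gap_bound(distance=0.5, complexity=1000, num_train=10_000)
closer_graphs = toy_gap_bound(distance=0.0, complexity=1000, num_train=100)

print(simple, expressive, more_data, closer_graphs)
```

In this toy setting the expressive model has the larger value at equal data, and either more training graphs or a smaller train/test distance reduces it, mirroring the qualitative statement in the abstract.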

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

- The idea of task-aligned pseudometrics for graphs, formalized as ζ-TMDs, is novel and compelling. Extending TMD through strong simulation of CRAs unifies diverse expressive GNNs (k-GNNs, subgraph models, motif-augmented MPNNs) under a single stability/robustness lens.
- Proofs are extensive and careful, showing Lipschitz continuity of ζ-MPNNs and deriving PAC-Bayes-style bounds where the generalization gap depends on $\xi_\zeta$. The synthetic and real-world experiments are well targeted to t

Weaknesses

- Identifying the right ζ (task-aligned pseudometric) can be hard: the framework presumes labels are “strongly correlated” with a chosen $\zeta$-TMD. In practice, discovering the right ζ is nontrivial and can be expensive (e.g., motif counts, product graphs).
- The paper assumes the CRA to be strongly simulatable but does not state under what conditions a CRA is strongly simulatable; some examples would help.
- Bound looseness and constants: The PAC-Bayes bound involves many constants (wi

Reviewer 02 · Rating 4 · Confidence 3

Strengths

Clear metric → stability → generalization pipeline with non-trivial derivations. The paper defines task-aligned pseudometrics ζ-TMDs (Def. 3.1; Prop. 3.2–3.3), proves Lipschitz stability of ζ-MPNNs with respect to ζ-TMD (Theorem 3.4; e.g., “$\|h(G) - h(H)\| \le L \cdot \zeta\text{-TMD}^{T+1}(G, H)$”), and derives PAC-Bayes generalization bounds that separate a capacity term from a structural-similarity term (Theorem 4.1 and its simplified Eq. (4)). These steps connect distance between graphs → Lipschitz robustness → gen

Weaknesses

Choosing ζ is nontrivial and can be the bottleneck; once ζ is fixed, several proofs become mechanical. The main benefit hinges on picking a ζ that actually aligns with labels. The authors acknowledge this: “the relevant pseudometric is often unknown and possibly expensive to compute” (Limitations), and $R_\zeta$ can inflate size/degree/features (Eq. (4) discussion: “model complexity may increase if the graph transformation $R_\zeta$ enlarges…”). Since many results follow from strong simulation plus known Lip

Reviewer 03 · Rating 4 · Confidence 3

Strengths

- The paper tackles a central open problem in graph learning: understanding when increased GNN expressivity improves versus harms generalization. This is a core theoretical theme highlighted in recent GNN survey and workshop discussions, and the paper makes progress on a topic with clear community interest.
- The ζ-TMD construction formalizes the intuition that model expressivity should match task-relevant structural signals. By connecting GNN expressivity to pseudometrics that capture structura

Weaknesses

- Although the paper proposes a ζ-TMD framework, the incremental contribution over closely related analyses of structure-label alignment and robustness in recent literature (for example Ma et al., 2021; Li et al., 2024; Vasileiou et al., 2024) is not entirely explicit. The manuscript would benefit from a sharper articulation of what new theoretical insight is enabled beyond these frameworks, particularly given that alignment-based perspectives have been explored previously.
- The approach assume

Code & Models

Repositories

rpaolino/genvsexp · PyTorch · Official

Taxonomy

Topics: Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI) · Graph Theory and Algorithms

Methods: Sparse Evolutionary Training