On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning

Álvaro Arroyo; Alessio Gravina; Benjamin Gutteridge; Federico Barbero; Claudio Gallicchio; Xiaowen Dong; Michael Bronstein; Pierre Vandergheynst

arXiv:2502.10818·cs.LG·October 28, 2025

Open Access

TL;DR

This paper analyzes over-smoothing and over-squashing in GNNs through the lens of vanishing gradients and proposes a state-space formulation that mitigates both problems at no extra trainable parameter cost.

Contribution

The paper offers a unified view of over-smoothing and over-squashing using ideas from linear control theory, interprets GNNs as recurrent models, and introduces a simple state-space formulation that alleviates both problems.

Findings

01 GNNs are prone to extreme gradient vanishing after only a few layers.

02 Over-smoothing is linked to vanishing gradients.

03 Graph rewiring combined with vanishing-gradient mitigation alleviates over-squashing.

Abstract

Graph Neural Networks (GNNs) are models that leverage the graph structure to transmit information between nodes, typically through the message-passing operation. While widely successful, this approach is well known to suffer from the over-smoothing and over-squashing phenomena, which result in representational collapse as the number of layers increases and insensitivity to the information contained at distant and poorly connected nodes, respectively. In this paper, we present a unified view of these problems through the lens of vanishing gradients, using ideas from linear control theory for our analysis. We propose an interpretation of GNNs as recurrent models and empirically demonstrate that a simple state-space formulation of a GNN effectively alleviates over-smoothing and over-squashing at no extra trainable parameter cost. Further, we show theoretically and empirically that (i) GNNs…
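The link the abstract draws between depth, vanishing gradients, and representational collapse can be illustrated with a toy experiment. The sketch below is not the paper's state-space model. It assumes a purely linear GCN-style layer, X_{k+1} = Â X_k W, on a six-node path graph, with a contractive weight W = 0.5·I. The function names, the graph, and the use of Dirichlet energy as a smoothness measure are all choices made for this illustration. For this linear stack the Jacobian of the output with respect to the input is (W^k)^T ⊗ Â^k, so its spectral norm is the product of the two factors' spectral norms.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def dirichlet_energy(x, a_norm):
    """trace(X^T (I - A_norm) X); it shrinks as node features become smoother."""
    laplacian = np.eye(a_norm.shape[0]) - a_norm
    return float(np.trace(x.T @ laplacian @ x))


def run_linear_layers(adj, x0, weight, num_layers):
    """Apply X <- A_norm X W repeatedly (hypothetical toy layer, no nonlinearity).

    Returns the Dirichlet energy after 0..num_layers layers and the spectral
    norm of the input-output Jacobian after 1..num_layers layers.
    """
    a_norm = normalized_adjacency(adj)
    x = x0
    energies = [dirichlet_energy(x, a_norm)]
    jac_norms = []
    for k in range(1, num_layers + 1):
        x = a_norm @ x @ weight
        energies.append(dirichlet_energy(x, a_norm))
        # Jacobian of vec(X_k) w.r.t. vec(X_0) is (W^k)^T kron A_norm^k;
        # the spectral norm of a Kronecker product is the product of norms.
        a_pow = np.linalg.matrix_power(a_norm, k)
        w_pow = np.linalg.matrix_power(weight, k)
        jac_norms.append(np.linalg.norm(a_pow, 2) * np.linalg.norm(w_pow, 2))
    return energies, jac_norms


# Assumed toy setup: a path graph with 6 nodes and 4 random features per node.
num_nodes = 6
adj = np.zeros((num_nodes, num_nodes))
for i in range(num_nodes - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x0 = rng.standard_normal((num_nodes, 4))
weight = 0.5 * np.eye(4)  # contractive weights, chosen to make the decay visible

energies, jac_norms = run_linear_layers(adj, x0, weight, num_layers=16)
for k in (1, 4, 16):
    print(f"layers={k:2d}  energy={energies[k]:.3e}  jacobian_norm={jac_norms[k - 1]:.3e}")
```

In this toy setting both quantities decay together as layers are added. Part of the energy decay comes from the shrinking feature norm rather than from smoothing alone, so the example should be read as an intuition aid rather than a reproduction of the paper's analysis.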

Taxonomy

Topics: Advanced Graph Neural Networks · Brain Tumor Detection and Classification · Machine Learning and ELM

Methods: Graph Neural Network