Short-Range Oversquashing
Yaaqov Mishayev, Yonatan Sverdlov, Tal Amir, Nadav Dym

TL;DR
This paper shows that oversquashing in message passing neural networks (MPNNs) arises even in short-range tasks, not only in long-range ones. The short-range bottleneck is not captured by existing explanations and is not resolved by rewiring techniques such as virtual nodes, whereas transformers mitigate it.
Contribution
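The study shows that oversquashing is not solely a long-range issue and separates two mechanisms behind it: a bottleneck phenomenon, which can arise even in short-range settings, and a vanishing gradient phenomenon, which is associated with long-range tasks. Its results favor transformers over MPNNs for mitigating oversquashing.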
Findings
Oversquashing occurs in short-range problems, not just long-range.
Virtual nodes do not resolve short-range oversquashing.
Transformers outperform MPNNs in mitigating oversquashing.
Abstract
Message Passing Neural Networks (MPNNs) are widely used for learning on graphs, but their ability to process long-range information is limited by the phenomenon of oversquashing. This limitation has led some researchers to advocate Graph Transformers as a better alternative, whereas others suggest that it can be mitigated within the MPNN framework, using virtual nodes or other rewiring techniques. In this work, we demonstrate that oversquashing is not limited to long-range tasks, but can also arise in short-range problems. This observation allows us to disentangle two distinct mechanisms underlying oversquashing: (1) the bottleneck phenomenon, which can arise even in low-range settings, and (2) the vanishing gradient phenomenon, which is closely associated with long-range tasks. We further show that the short-range bottleneck effect is not captured by existing explanations for…
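The bottleneck mechanism named in the abstract can be illustrated with a toy example. The sketch below is not the paper's construction or experiment. It assumes plain sum aggregation on a star graph, and the helper names (`star_graph`, `add_virtual_node`, `sum_aggregate`) are hypothetical. It shows that a node one hop away from k neighbors still has to fit k·d incoming numbers into a single d-dimensional vector, and that appending a virtual node leaves that vector's width unchanged.

```python
import numpy as np

def star_graph(k):
    """Adjacency matrix of a star: node 0 is the center, nodes 1..k are leaves."""
    adj = np.zeros((k + 1, k + 1))
    adj[0, 1:] = 1.0
    adj[1:, 0] = 1.0
    return adj

def add_virtual_node(adj):
    """Rewire by appending one extra node connected to every original node."""
    n = adj.shape[0]
    out = np.ones((n + 1, n + 1))
    out[:n, :n] = adj   # keep the original edges
    out[n, n] = 0.0     # no self-loop on the virtual node
    return out

def sum_aggregate(adj, x):
    """One round of sum aggregation: each node sums its neighbors' features."""
    return adj @ x

k, d = 8, 2                       # illustrative sizes, chosen arbitrarily
rng = np.random.default_rng(0)
x = rng.normal(size=(k + 1, d))   # one d-dimensional feature per node

h = sum_aggregate(star_graph(k), x)
center_inputs = k * d             # numbers arriving at the center in one hop
center_capacity = h[0].size       # numbers the center can store (d)

# Same step after rewiring; the virtual node starts with a zero feature.
x_vn = np.vstack([x, np.zeros((1, d))])
h_vn = sum_aggregate(add_virtual_node(star_graph(k)), x_vn)
```

Here every relevant node is at distance 1 from the center, so the compression from `center_inputs` to `center_capacity` values has nothing to do with range. The rewired graph gives the center one more neighbor but the same d-dimensional state.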
Taxonomy
Topics: Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
