Deep Neural Sheaf Diffusion

Remi Bourgerie; Sarunas Girdzijauskas; Viktoria Fodor

arXiv:2605.19021·cs.LG·May 20, 2026

TL;DR

This paper introduces Deep Neural Sheaf Diffusion (DNSD), a deep graph neural network method that counters representation collapse in deep layers by replacing the sheaf Laplacian with a sheaf adjacency operator, so that aggregation stays informative as depth grows.

Contribution

The paper proposes DNSD, a deep GNN architecture that maintains informative signals across layers by replacing the sheaf Laplacian in the sheaf diffusion process with a sheaf adjacency operator, complemented by normalization, odd nonlinearities, and gating.

Findings

01

DNSD outperforms GNN and NSD baselines by up to 30 percentage points (pp) in accuracy on synthetic datasets.

02

DNSD consistently outperforms baselines on real-world benchmarks.

03

Theoretical analysis contrasts sheaf diffusion with graph attention mechanisms.

Abstract

Deep Graph Neural Networks (GNNs) are essential for capturing complex dependencies in graph-structured data. However, scaling GNNs to depth remains challenging, as stacking layers leads to representation collapse and diminishing sensitivity due to repeated aggregation. While Neural Sheaf Diffusion (NSD) provides strong theoretical guarantees against such collapse, these guarantees do not translate to practice: as depth increases, the disagreement signal of the sheaf Laplacian vanishes, limiting the contribution of deeper layers. We identify mechanisms that hinder NSD effectiveness at depth and propose Deep Neural Sheaf Diffusion (DNSD), which replaces the sheaf Laplacian with a sheaf adjacency operator to maintain informative signals across layers. This is complemented by normalization, odd nonlinearities, and gating. To provide a principled explanation of the expected…
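The vanishing disagreement signal described in the abstract can be illustrated on a toy example. The sketch below is not the authors' implementation: it assumes a four-node path graph, one-dimensional stalks with hand-picked scalar restriction maps, plain linear diffusion with a fixed step size, and the standard decomposition of the sheaf Laplacian into a degree term minus an adjacency term. All names (`edges`, `r`, `laplacian`, `adjacency`, `diffuse`) are hypothetical, and the paper's normalization, odd nonlinearities, and gating are omitted.

```python
# Toy sheaf on the path graph 0-1-2-3 with 1-dimensional stalks.
# r[(node, edge_index)] is the scalar restriction map of that node into that edge.
# Every value here is an illustrative assumption, not taken from the paper.
edges = [(0, 1), (1, 2), (2, 3)]
r = {
    (0, 0): 1.0, (1, 0): -1.0,   # edge 0 joins nodes 0 and 1
    (1, 1): 1.0, (2, 1): 1.0,    # edge 1 joins nodes 1 and 2
    (2, 2): -1.0, (3, 2): 1.0,   # edge 2 joins nodes 2 and 3
}
n = 4


def laplacian(x):
    """Sheaf Laplacian: each node accumulates its disagreement with neighbors."""
    out = [0.0] * n
    for e, (u, v) in enumerate(edges):
        diff = r[(u, e)] * x[u] - r[(v, e)] * x[v]  # mismatch in the edge stalk
        out[u] += r[(u, e)] * diff
        out[v] -= r[(v, e)] * diff
    return out


def adjacency(x):
    """Sheaf adjacency (assumed form): neighbor features transported across each edge."""
    out = [0.0] * n
    for e, (u, v) in enumerate(edges):
        out[u] += r[(u, e)] * r[(v, e)] * x[v]
        out[v] += r[(v, e)] * r[(u, e)] * x[u]
    return out


def degree(x):
    """Sheaf degree term, so that laplacian(x) == degree(x) - adjacency(x)."""
    out = [0.0] * n
    for e, (u, v) in enumerate(edges):
        out[u] += r[(u, e)] ** 2 * x[u]
        out[v] += r[(v, e)] ** 2 * x[v]
    return out


def norm(x):
    return sum(c * c for c in x) ** 0.5


def diffuse(x, steps, alpha=0.25):
    """Linear sheaf diffusion x <- x - alpha * L x, repeated `steps` times."""
    for _ in range(steps):
        lx = laplacian(x)
        x = [xi - alpha * li for xi, li in zip(x, lx)]
    return x


x0 = [1.0, 0.5, -2.0, 3.0]
# Size of the Laplacian (disagreement) signal after 0, 10, and 200 diffusion steps.
disagreement = [norm(laplacian(diffuse(x0, t))) for t in (0, 10, 200)]
# Size of the adjacency signal on the same deeply diffused features.
aggregated = norm(adjacency(diffuse(x0, 200)))

print("||L x|| at depth 0, 10, 200:", disagreement)
print("||A x|| at depth 200:", aggregated)
```

In this toy setting the Laplacian output shrinks toward zero as the features approach a global section, while the adjacency output on the same features stays well away from zero. That is one way to read the motivation for the operator swap, under the assumptions stated above.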
