Backward Oversmoothing: why is it hard to train deep Graph Neural Networks?
Nicolas Keriven

TL;DR
This paper investigates the challenge of training deep Graph Neural Networks by analyzing backward oversmoothing, revealing how it causes spurious stationary points and impedes learning.
Contribution
It introduces the concept of backward oversmoothing, demonstrating its impact on the optimization landscape and explaining why deep GNNs are difficult to train.
Findings
Backpropagated errors used to compute gradients are themselves oversmoothed as they travel from output to input.
Deep GNNs have many spurious stationary points due to backward oversmoothing.
This phenomenon is specific to GNNs and not present in MLPs.
Abstract
Oversmoothing has long been identified as a major limitation of Graph Neural Networks (GNNs): input node features are smoothed at each layer and converge to a non-informative representation if the weights of the GNN are sufficiently bounded. This assumption is crucial: if, on the contrary, the weights are sufficiently large, then oversmoothing may not happen. Theoretically, a GNN could thus learn not to oversmooth. However, this does not really happen in practice, which prompts us to examine oversmoothing from an optimization point of view. In this paper, we analyze backward oversmoothing, that is, the notion that backpropagated errors used to compute gradients are also subject to oversmoothing from output to input. With non-linear activation functions, we outline the key role of the interaction between forward and backward smoothing. Moreover, we show that, due to backward oversmoothing,…
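Illustration
The forward and backward smoothing described in the abstract can be reproduced numerically on a toy model. The sketch below is not the paper's setup: it assumes a linear GNN layer X_{k+1} = P X_k W_k on a 12-node ring graph, with P the symmetrically normalized adjacency matrix with self-loops and orthogonal weights W_k (spectral norm 1, standing in for "sufficiently bounded" weights). For this linear model the backpropagated error satisfies E_k = P^T E_{k+1} W_k^T, so it is multiplied by the same smoothing operator. The names `unsmoothed_fraction`, `forward`, and `backward`, and all sizes, are hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, depth = 12, 4, 64  # nodes, feature width, number of layers (illustrative values)

# Ring graph with self-loops, symmetrically normalized:
# P = D^{-1/2} (A + I) D^{-1/2}
idx = np.arange(n)
A = np.zeros((n, n))
A[idx, (idx + 1) % n] = 1.0
A[(idx + 1) % n, idx] = 1.0
A_hat = A + np.eye(n)
deg = A_hat.sum(axis=1)
P = A_hat / np.sqrt(np.outer(deg, deg))

# Unit-norm dominant eigenvector of P (eigenvalue 1 on a connected graph).
# eigh returns eigenvalues in ascending order, so the last column is the one we want.
v = np.linalg.eigh(P)[1][:, -1]


def unsmoothed_fraction(M):
    """Share of the Frobenius norm of M that lies outside the span of v."""
    residual = M - np.outer(v, v @ M)
    return np.linalg.norm(residual) / np.linalg.norm(M)


# Orthogonal weights (assumption): bounded, and they do not shrink the signal.
weights = [np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(depth)]

# Forward pass: node features are smoothed layer after layer.
X = rng.standard_normal((n, d))
forward = [unsmoothed_fraction(X)]
for W in weights:
    X = P @ X @ W
    forward.append(unsmoothed_fraction(X))

# Backward pass: an arbitrary error at the output is propagated toward the input.
E = rng.standard_normal((n, d))
backward = [unsmoothed_fraction(E)]
for W in reversed(weights):
    E = P.T @ E @ W.T
    backward.append(unsmoothed_fraction(E))

print("forward  unsmoothed fraction, input vs. last layer:", forward[0], forward[-1])
print("backward unsmoothed fraction, output vs. first layer:", backward[0], backward[-1])
```

In this toy setting both fractions shrink toward zero with depth: the features reaching the last layer and the errors reaching the first layer both end up almost entirely aligned with the dominant eigenvector of P. Scaling the weights up, or adding a non-linearity, changes the picture, which is the regime the abstract goes on to discuss.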
Taxonomy
Topics: Advanced Graph Neural Networks
