Depth Degeneracy in Neural Networks: Vanishing Angles in Fully Connected ReLU Networks on Initialization
Cameron Jakub, Mihai Nica

TL;DR
This paper investigates depth degeneracy in neural networks: on initialization, the angle between two inputs shrinks toward zero as depth increases, which can hinder training. The analysis is supported by explicit theoretical formulas and empirical validation.
Contribution
It provides explicit formulas for the angle decay in deep ReLU networks, capturing microscopic fluctuations beyond infinite width limits, and links these to combinatorial structures like Bessel numbers.
Findings
The angle between two inputs shrinks toward zero as depth increases, and the formulas quantify how fast.
The formulas accurately approximate finite-network behaviour, as validated with Monte Carlo experiments.
Depth degeneracy can negatively affect training performance.
Abstract
Despite remarkable performance on a variety of tasks, many properties of deep neural networks are not yet theoretically understood. One such mystery is the depth degeneracy phenomenon: the deeper you make your network, the closer your network is to a constant function on initialization. In this paper, we examine the evolution of the angle between two inputs to a ReLU neural network as a function of the number of layers. By using combinatorial expansions, we find precise formulas for how fast this angle goes to zero as depth increases. These formulas capture microscopic fluctuations that are not visible in the popular framework of infinite width limits, and lead to qualitatively different predictions. We validate our theoretical results with Monte Carlo experiments and show that our results accurately approximate finite network behaviour. We also empirically investigate how the…
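The setup described in the abstract, tracking the angle between two inputs layer by layer in a randomly initialized fully connected ReLU network, can be sketched as a small Monte Carlo experiment. This is a minimal illustration and not the authors' code: the function names, the width, depth, and input dimension, and the choice of bias-free Gaussian weights with variance 2/fan_in (He initialization) are all assumptions made for the example.

```python
import numpy as np


def angle_between(u, v):
    """Return the angle in radians between vectors u and v."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against round-off pushing the cosine outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def simulate_angles(x, y, width=256, depth=20, seed=0):
    """Push x and y through one random ReLU network.

    Returns a list of depth + 1 angles: the input angle, then the angle
    after each hidden layer. Weights are i.i.d. N(0, 2 / fan_in) with no
    biases (an assumed initialization, chosen for illustration).
    """
    rng = np.random.default_rng(seed)
    angles = [angle_between(x, y)]
    for _ in range(depth):
        fan_in = x.shape[0]
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(width, fan_in))
        # Both inputs pass through the same weights, then ReLU.
        x = np.maximum(W @ x, 0.0)
        y = np.maximum(W @ y, 0.0)
        angles.append(angle_between(x, y))
    return angles


if __name__ == "__main__":
    # Two orthogonal inputs in R^8 (hypothetical choice of inputs).
    x0 = np.zeros(8)
    x0[0] = 1.0
    y0 = np.zeros(8)
    y0[1] = 1.0

    # Average the per-layer angle over independent initializations.
    runs = np.array([simulate_angles(x0, y0, seed=s) for s in range(50)])
    for layer, mean_angle in enumerate(runs.mean(axis=0)):
        print(f"layer {layer:2d}: mean angle = {mean_angle:.4f} rad")
```

Running the script prints the mean angle per layer over 50 random initializations. Varying `width` at fixed `depth` is one way to see how finite-width behaviour depends on the network size, which is the regime the paper's formulas are meant to describe.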
Taxonomy
Topics: Neural Networks and Applications · Machine Learning in Materials Science · Stochastic Gradient Optimization Techniques
