Proportional infinite-width infinite-depth limit for deep linear neural networks
Federico Bassetti, Lucia Ladelli, Pietro Rotondo

TL;DR
This paper investigates the limiting behavior of deep linear neural networks as both width and depth grow proportionally, revealing a non-Gaussian distribution that better captures output correlations than traditional Gaussian process limits.
Contribution
The paper extends previous analyses by rigorously characterizing the joint proportional limit of width and depth in linear neural networks, showing that the limiting distribution is a non-Gaussian mixture of Gaussians.
Findings
Limiting distribution is a non-Gaussian mixture of Gaussians.
The joint proportional limit preserves output correlations.
Traditional Gaussian limits are insufficient for dependent feature learning.
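The first finding can be illustrated numerically. The sketch below is not the authors' construction; it assumes a bias-free linear network with square hidden layers, i.i.d. N(0, 1/width) weights, a fixed unit-norm input, and a scalar readout. The names sample_outputs, kurtosis, and predicted_kurtosis are hypothetical. Under these assumptions the output, conditioned on the hidden layers, is Gaussian with a random variance, so it is a scale mixture of Gaussians whose kurtosis is 3(1 + 2/width)^depth. That value tends to 3 (the Gaussian value) when depth is fixed and width grows, but to 3·exp(2·depth/width) when the two grow in proportion.

```python
import numpy as np


def sample_outputs(width, depth, num_samples, rng):
    """Draw scalar outputs of independent random deep linear networks.

    Assumed setup (not taken from the paper): `depth` square weight
    matrices with i.i.d. N(0, 1/width) entries, followed by a readout
    vector with i.i.d. N(0, 1/width) entries, applied to one fixed input.
    """
    x = np.ones(width) / np.sqrt(width)      # fixed unit-norm input
    h = np.tile(x, (num_samples, 1))         # one copy per sampled network
    scale = 1.0 / np.sqrt(width)
    for _ in range(depth):
        # Fresh weights for every sampled network: shape (S, width, width).
        W = rng.normal(0.0, scale, size=(num_samples, width, width))
        h = np.einsum("sij,sj->si", W, h)
    v = rng.normal(0.0, scale, size=(num_samples, width))
    return np.einsum("si,si->s", v, h)


def kurtosis(samples):
    """Empirical kurtosis E[(f - mean)^4] / Var(f)^2; equals 3 for a Gaussian."""
    centered = samples - samples.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2


def predicted_kurtosis(width, depth):
    """Exact kurtosis under the assumptions above: 3 * (1 + 2/width)^depth."""
    return 3.0 * (1.0 + 2.0 / width) ** depth


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    width = 20
    for depth in (1, 5, 20):
        f = sample_outputs(width, depth, 4000, rng)
        print(
            f"depth/width = {depth / width:.2f}  "
            f"empirical kurtosis = {kurtosis(f):.2f}  "
            f"predicted = {predicted_kurtosis(width, depth):.2f}"
        )
```

Because the output is heavy-tailed when depth/width is not small, the empirical kurtosis converges slowly and tends to sit below the predicted value at moderate sample sizes. The qualitative gap from the Gaussian value of 3 is nonetheless visible.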
Abstract
We study the distributional properties of linear neural networks with random parameters in the context of large networks, where the number of layers diverges in proportion to the number of neurons per layer. Prior works have shown that in the infinite-width regime, where the number of neurons per layer grows to infinity while the depth remains fixed, neural networks converge to a Gaussian process, known as the Neural Network Gaussian Process. However, this Gaussian limit sacrifices descriptive power, as it lacks the ability to learn dependent features and produce output correlations that reflect observed labels. Motivated by these limitations, we explore the joint proportional limit in which both depth and width diverge but maintain a constant ratio, yielding a non-Gaussian distribution that retains correlations between outputs. Our contribution extends previous works by rigorously…
Taxonomy
Topics: Neural Networks and Applications · Medical Imaging and Analysis
Methods: Gaussian Process
