Information-theoretic reduction of deep neural networks to linear models in the overparametrized proportional regime
Francesco Camilli, Daria Tieplova, Eleonora Bergamin, Jean Barbier

TL;DR
This paper establishes an information-theoretic equivalence between fully-trained deep neural networks and generalized linear models in the proportional scaling regime, enabling precise calculation of their optimal generalization error.
Contribution
It proves the deep Gaussian equivalence principle, showing neural networks reduce to linear models in the overparametrized proportional regime, and discusses implications for model complexity and data requirements.
Findings
Proves the deep Gaussian equivalence principle.
Calculates the optimal generalization error for deep neural networks.
Shows neural networks trivialize to linear models in the proportional regime.
Abstract
We rigorously analyse fully-trained neural networks of arbitrary depth in the Bayesian optimal setting in the so-called proportional scaling regime where the number of training samples and width of the input and all inner layers diverge proportionally. We prove an information-theoretic equivalence between the Bayesian deep neural network model trained from data generated by a teacher with matching architecture, and a simpler model of optimal inference in a generalized linear model. This equivalence enables us to compute the optimal generalization error for deep neural networks in this regime. We thus prove the "deep Gaussian equivalence principle" conjectured in Cui et al. (2023) (arXiv:2302.00375). Our result highlights that in order to escape this "trivialisation" of deep neural networks (in the sense of reduction to a linear model) happening in the strongly overparametrized…
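As a rough illustration of what "reduction to a linear model" means, the sketch below computes the Gaussian moments of an activation function. Arguments of Gaussian-equivalence type typically replace a nonlinearity applied to a standard Gaussian pre-activation z by mu0 + mu1 * z plus independent noise of standard deviation mu_star. This is a generic illustration, not the paper's construction: the function name `gaussian_moments`, the choice of tanh, and the quadrature order are all assumptions made here.

```python
import numpy as np

def gaussian_moments(sigma, n_nodes=80):
    """Moments of sigma(z) for z ~ N(0, 1), via Gauss-Hermite quadrature.

    Returns (mu0, mu1, mu_star) with
      mu0     = E[sigma(z)]
      mu1     = E[z * sigma(z)]
      mu_star = sqrt(E[sigma(z)^2] - mu0^2 - mu1^2)
    so that sigma(z) and mu0 + mu1 * z + mu_star * xi, with xi ~ N(0, 1)
    independent of z, share the same mean, variance and covariance with z.
    """
    # Nodes and weights for the weight function exp(-x^2 / 2).
    x, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / np.sqrt(2.0 * np.pi)  # normalise so the weights sum to 1
    s = sigma(x)
    mu0 = np.sum(w * s)
    mu1 = np.sum(w * x * s)
    second_moment = np.sum(w * s**2)
    # Clip at zero to guard against tiny negative values from round-off.
    mu_star = np.sqrt(max(second_moment - mu0**2 - mu1**2, 0.0))
    return mu0, mu1, mu_star

# Hypothetical usage: linear and noise coefficients for a tanh activation.
mu0, mu1, mu_star = gaussian_moments(np.tanh)
print(f"mu0={mu0:.4f}, mu1={mu1:.4f}, mu_star={mu_star:.4f}")
```

For a purely linear activation the noise term mu_star vanishes, whereas for a nonlinear one it is strictly positive. In this picture the nonlinearity contributes only a linear signal plus effective noise.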
Taxonomy
Topics: Neural Networks and Applications
