Double-descent curves in neural networks: a new perspective using Gaussian processes
Ouns El Harzli, Bernardo Cuenca Grau, Guillermo Valle-Pérez, Ard A. Louis

TL;DR
This paper offers a new perspective on the double-descent phenomenon in neural networks by linking spectral properties of empirical feature covariance matrices to Gaussian processes using random matrix theory.
Contribution
It introduces an analytical framework, based on random matrix theory, that connects the spectrum of a finite-width network's empirical feature covariance matrix to the NNGP kernel, and uses this connection to explain double-descent behavior.
Findings
The spectral distribution of the empirical feature covariance matrix is a width-dependent perturbation of the NNGP kernel spectrum.
Double descent is explained by the discrepancy between the width-dependent empirical kernel and the NNGP kernel.
The analytical expression for the spectrum makes it possible to study the generalisation behavior of the corresponding kernel and GP regression.
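The width dependence described in these findings can be illustrated numerically. The following is a minimal sketch, not the paper's analytical result or experimental setup: it assumes a single hidden layer of ReLU units with standard Gaussian weights, for which the infinite-width (NNGP) kernel is the degree-1 arc-cosine kernel, and compares its eigenvalues with those of the finite-width empirical kernel. The function names, data sizes, and widths are illustrative choices.

```python
import numpy as np

def nngp_relu_kernel(X):
    # Infinite-width kernel E_w[relu(w.x) relu(w.x')] for w ~ N(0, I):
    # the degree-1 arc-cosine kernel.
    norms = np.linalg.norm(X, axis=1)
    scale = np.outer(norms, norms)
    cos = np.clip((X @ X.T) / scale, -1.0, 1.0)
    theta = np.arccos(cos)
    return scale * (np.sin(theta) + (np.pi - theta) * cos) / (2.0 * np.pi)

def empirical_kernel(X, width, rng):
    # Finite-width estimate of the same kernel from `width` random ReLU features.
    W = rng.standard_normal((X.shape[1], width))
    features = np.maximum(X @ W, 0.0)
    return features @ features.T / width

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))      # 50 inputs in 10 dimensions (assumed sizes)
K_inf = nngp_relu_kernel(X)
eig_inf = np.linalg.eigvalsh(K_inf)    # eigenvalues in ascending order

spectral_gap = {}
for width in (16, 256, 4096):
    eig_emp = np.linalg.eigvalsh(empirical_kernel(X, width, rng))
    # Largest eigenvalue mismatch, relative to the top NNGP eigenvalue.
    spectral_gap[width] = float(np.max(np.abs(eig_emp - eig_inf)) / eig_inf[-1])
    print(f"width={width:5d}  relative spectral gap={spectral_gap[width]:.4f}")
```

In this toy setting the gap between the two spectra shrinks as the width grows. At width 16 the empirical kernel has rank at most 16, so most of its 50 eigenvalues are zero, which is one simple way a finite-width spectrum can depart from the NNGP spectrum.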
Abstract
Double-descent curves in neural networks describe the phenomenon that the generalisation error initially descends with increasing parameters, then grows after reaching an optimal number of parameters which is less than the number of data points, but then descends again in the overparameterized regime. In this paper, we use techniques from random matrix theory to characterize the spectral distribution of the empirical feature covariance matrix as a width-dependent perturbation of the spectrum of the neural network Gaussian process (NNGP) kernel, thus establishing a novel connection between the NNGP literature and the random matrix theory literature in the context of neural networks. Our analytical expression allows us to study the generalisation behavior of the corresponding kernel and GP regression, and provides a new interpretation of the double-descent phenomenon, namely as governed…
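The shape of a double-descent curve can be reproduced in a small experiment. The sketch below is a standard random-features toy model, not the paper's experiments: it assumes a frozen random ReLU first layer, a noisy linear target, and a minimum-norm least-squares fit of the output weights, and it reports the median held-out error at three widths. All names, sizes, and the noise level are illustrative assumptions.

```python
import numpy as np

def relu_features(X, W):
    # Random ReLU features, scaled by 1/sqrt(width).
    return np.maximum(X @ W, 0.0) / np.sqrt(W.shape[1])

def heldout_mse(width, rng, n_train=40, n_test=500, d=10, noise=0.5):
    # Noisy linear target (an assumption made for this sketch).
    w_true = rng.standard_normal(d) / np.sqrt(d)
    X_train = rng.standard_normal((n_train, d))
    X_test = rng.standard_normal((n_test, d))
    y_train = X_train @ w_true + noise * rng.standard_normal(n_train)
    y_test = X_test @ w_true               # noiseless held-out targets
    W = rng.standard_normal((d, width))    # frozen random first layer
    # lstsq gives the least-squares fit, and the minimum-norm
    # solution when the system is underdetermined (width > n_train).
    coef, *_ = np.linalg.lstsq(relu_features(X_train, W), y_train, rcond=None)
    pred = relu_features(X_test, W) @ coef
    return float(np.mean((pred - y_test) ** 2))

rng = np.random.default_rng(0)
n_train = 40
errors = {}
for width in (5, 40, 400):
    trials = [heldout_mse(width, rng, n_train=n_train) for _ in range(20)]
    errors[width] = float(np.median(trials))   # median is robust to rare blow-ups
    print(f"width={width:4d}  median held-out MSE={errors[width]:.3f}")
```

In this toy model the held-out error is expected to be largest when the width is close to the number of training points, and lower both for much narrower and for much wider feature maps.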
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications · Blind Source Separation Techniques
Methods: Gaussian Process
