A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities

Yatin Dandi; Luca Pesce; Hugo Cui; Florent Krzakala; Yue M. Lu; Bruno Loureiro

arXiv:2410.18938·stat.ML·October 25, 2024

Open Access

TL;DR

This paper uses random matrix theory to analyze how fully-connected two-layer neural networks adapt their features after a single large gradient descent step, giving a rigorous account of how the feature spectrum changes and of the resulting generalization error in the large-batch limit.

Contribution

The paper establishes a rigorous equivalence between the features after the gradient step and an isotropic spiked random feature model, then derives a deterministic equivalent for the empirical feature covariance and the exact asymptotic generalization error.

Findings

01. Characterizes the impact of training on the tails of the feature spectrum.

02. Provides a deterministic equivalent for the feature covariance matrix.

03. Derives the exact asymptotic generalization error.

Abstract

A key property of neural networks is their capacity to adapt to data during training. Yet our current mathematical understanding of feature learning and its relationship to generalization remains limited. In this work, we provide a random matrix analysis of how fully-connected two-layer neural networks adapt to the target function after a single, but aggressive, gradient descent step. We rigorously establish the equivalence between the updated features and an isotropic spiked random feature model, in the limit of large batch size. For the latter model, we derive a deterministic equivalent description of the feature empirical covariance matrix in terms of certain low-dimensional operators. This allows us to sharply characterize the impact of training on the asymptotic feature spectrum and, in particular, provides a theoretical grounding for how the tails of the feature spectrum modify…
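The setting described in the abstract can be explored numerically with a small sketch. The code below is not the authors' experimental setup: the Gaussian inputs, the single-index tanh target, the tanh activation, the weight scalings, the learning rate eta, and all function names are assumptions chosen for illustration. It takes one full-batch gradient step on the first layer of a two-layer network, measures how much of the gradient's energy sits in its top singular direction (a rough proxy for a rank-one "spike"), and computes the eigenvalues of the empirical feature covariance for a given weight matrix.

```python
import numpy as np


def one_gradient_step(d=50, p=40, n=4000, eta=5.0, seed=0):
    """One full-batch gradient step on the first layer of a two-layer network."""
    rng = np.random.default_rng(seed)

    # Assumed data model: Gaussian inputs, single-index target tanh(<w_star, x>).
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)
    X = rng.standard_normal((n, d))
    y = np.tanh(X @ w_star)

    # Assumed initialization: random first layer, fixed random-sign second layer.
    W0 = rng.standard_normal((p, d)) / np.sqrt(d)
    a = rng.choice([-1.0, 1.0], size=p) / np.sqrt(p)

    pre = X @ W0.T                    # pre-activations, shape (n, p)
    resid = np.tanh(pre) @ a - y      # prediction error, shape (n,)
    # Gradient of (1 / 2n) * sum_i resid_i^2 with respect to W0, shape (p, d).
    G = (resid[:, None] * (1.0 - np.tanh(pre) ** 2) * a[None, :]).T @ X / n
    W1 = W0 - eta * G
    return W0, W1, G, w_star


def spike_summary(G, w_star):
    """Energy fraction of the top singular direction and its overlap with w_star."""
    _, s, Vt = np.linalg.svd(G, full_matrices=False)
    top_fraction = s[0] ** 2 / np.sum(s ** 2)
    alignment = abs(Vt[0] @ w_star)
    return top_fraction, alignment


def feature_spectrum(W, n=2000, seed=1):
    """Eigenvalues (ascending) of the empirical covariance of tanh features."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, W.shape[1]))
    Phi = np.tanh(X @ W.T)            # features, shape (n, p)
    return np.linalg.eigvalsh(Phi.T @ Phi / n)


if __name__ == "__main__":
    W0, W1, G, w_star = one_gradient_step()
    frac, align = spike_summary(G, w_star)
    print("top singular direction energy fraction:", frac)
    print("overlap of top right singular vector with w_star:", align)
    print("largest feature eigenvalue before step:", feature_spectrum(W0)[-1])
    print("largest feature eigenvalue after step:", feature_spectrum(W1)[-1])
```

Comparing the output of `feature_spectrum(W0)` with `feature_spectrum(W1)` gives a rough empirical view of how the step changes the spectrum. The sketch makes no claim to reproduce the paper's asymptotic predictions, which concern a specific high-dimensional scaling that this toy example does not attempt to match.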

Taxonomy

Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Morphological variations and asymmetry