Phase Transitions for Feature Learning in Neural Networks
Andrea Montanari, Zihao Wang

TL;DR
This paper investigates the phase transition thresholds in neural network feature learning, analyzing how the network's ability to learn low-dimensional representations depends on data and architecture parameters.
Contribution
It derives a new threshold $\delta_{\text{NN}}$ for two-layer neural networks, extending previous theoretical results to understand learning dynamics.
Findings
Identifies a phase transition in the Hessian spectrum during training.
Derives a threshold $\delta_{\text{NN}}$ for neural network feature learning.
Provides insights into how network architecture influences learning dynamics.
Abstract
According to a popular viewpoint, neural networks learn from data by first identifying low-dimensional representations, and subsequently fitting the best model in this space. Recent works provide a formalization of this phenomenon when learning multi-index models. In this setting, we are given $n$ i.i.d. pairs $(x_i, y_i)$, where the covariate vectors $x_i \in \mathbb{R}^d$ are isotropic, and responses $y_i$ only depend on $x_i$ through a $k$-dimensional projection $\Theta_*^{\mathsf{T}} x_i$. Feature learning amounts to learning the latent space spanned by $\Theta_*$. In this context, we study the gradient descent dynamics of two-layer neural networks under the proportional asymptotics $n, d \to \infty$, $n/d \to \delta$, while the dimension of the latent space $k$ and the number of hidden neurons $m$ are kept…
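The multi-index setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code. It assumes Gaussian covariates as one instance of isotropic vectors, uses hypothetical names (`sample_multi_index`, `example_link`, `theta`), and picks an arbitrary link function purely for illustration. Here `n` is the sample size, `d` the ambient dimension, `k` the latent dimension, and `delta` the ratio n/d.

```python
import numpy as np

def sample_multi_index(n, d, k, link, rng):
    """Draw n pairs (x_i, y_i) from a multi-index model (illustrative sketch)."""
    # Orthonormal basis of the latent space: the k columns of theta (d x k).
    theta, _ = np.linalg.qr(rng.standard_normal((d, k)))
    # Isotropic covariates; standard Gaussian rows are an assumption here.
    x = rng.standard_normal((n, d))
    # Responses depend on x only through the k-dimensional projection z.
    z = x @ theta
    y = link(z)
    return x, y, theta

def example_link(z):
    # Hypothetical link function, chosen only for illustration.
    return np.tanh(z[:, 0]) + z[:, 1] ** 2

rng = np.random.default_rng(0)
d, delta, k = 200, 3.0, 2
n = int(delta * d)  # proportional regime: n / d is held at delta
x, y, theta = sample_multi_index(n, d, k, example_link, rng)

# Perturbing x in directions orthogonal to the latent space leaves y unchanged.
noise = rng.standard_normal((n, d))
noise_perp = noise - (noise @ theta) @ theta.T
y_perturbed = example_link((x + noise_perp) @ theta)
```

The final check makes the defining property concrete: any component of the covariates outside the span of `theta` carries no information about the response, which is why feature learning in this setting reduces to recovering that span.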
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Quantum many-body systems
