Phase Transitions for Feature Learning in Neural Networks

Andrea Montanari; Zihao Wang

arXiv:2602.01434 · cs.LG · February 27, 2026

TL;DR

This paper investigates the phase transition thresholds in neural network feature learning, analyzing how the network's ability to learn low-dimensional representations depends on data and architecture parameters.

Contribution

It derives a new threshold $\delta_{\text{NN}}$ for two-layer neural networks, extending previous theoretical results to understand learning dynamics.

Findings

1. Identifies a phase transition in the Hessian spectrum during training.

2. Derives a threshold $\delta_{\text{NN}}$ for neural network feature learning.

3. Provides insights into how network architecture influences learning dynamics.

Abstract

According to a popular viewpoint, neural networks learn from data by first identifying low-dimensional representations, and subsequently fitting the best model in this space. Recent works provide a formalization of this phenomenon when learning multi-index models. In this setting, we are given $n$ i.i.d. pairs $(x_{i}, y_{i})$, where the covariate vectors $x_{i} \in \mathbb{R}^{d}$ are isotropic, and responses $y_{i}$ only depend on $x_{i}$ through a $k$-dimensional projection $\Theta_{*}^{\top} x_{i}$. Feature learning amounts to learning the latent space spanned by $\Theta_{*}$. In this context, we study the gradient descent dynamics of two-layer neural networks under the proportional asymptotics $n, d \to \infty$, $n / d \to \delta$, while the dimension of the latent space $k$ and the number of hidden neurons $m$ are kept…
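The multi-index data model described in the abstract can be illustrated with a small synthetic sampler. This is a minimal sketch under stated assumptions, not the authors' experimental setup: the covariates are taken to be standard Gaussian (one instance of isotropic), the latent directions are orthonormalized random vectors, and the product link function, the optional noise term, and all names (`orthonormal_directions`, `sample_multi_index`, `product_link`) are hypothetical choices made for illustration.

```python
import math
import random


def orthonormal_directions(d, k, rng):
    """Return k orthonormal vectors in R^d (Gram-Schmidt on Gaussian draws)."""
    basis = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Remove the components along the directions found so far.
        for u in basis:
            proj = sum(a * b for a, b in zip(u, v))
            v = [b - proj * a for a, b in zip(u, v)]
        norm = math.sqrt(sum(b * b for b in v))
        basis.append([b / norm for b in v])
    return basis


def sample_multi_index(n, d, k, link, noise_std=0.0, seed=0):
    """Draw n pairs (x_i, y_i) where y_i depends on x_i only via Theta^T x_i.

    Assumptions for this sketch: Gaussian covariates, additive Gaussian noise.
    """
    rng = random.Random(seed)
    theta = orthonormal_directions(d, k, rng)  # the k columns of Theta_*
    xs, ys = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # k-dimensional projection z = Theta_*^T x
        z = [sum(t * a for t, a in zip(col, x)) for col in theta]
        ys.append(link(z) + noise_std * rng.gauss(0.0, 1.0))
        xs.append(x)
    return theta, xs, ys


def product_link(z):
    """Hypothetical link function on the latent coordinates."""
    return z[0] * z[1]


# Proportional regime at finite size: n / d = delta.
delta, d, k = 4.0, 50, 2
n = int(delta * d)
theta, xs, ys = sample_multi_index(n, d, k, product_link)
```

Here `delta` plays the role of the sample-to-dimension ratio $n / d$, while `k` stays fixed as `d` grows. Feature learning in this setting means recovering the span of `theta` from `xs` and `ys` alone.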

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Quantum many-body systems