Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory
Minhak Song, Chulhee Yun

TL;DR
This paper investigates the Edge of Stability phenomenon in gradient descent, demonstrating trajectory alignment through bifurcation theory and providing rigorous proofs for simplified neural network models.
Contribution
The paper introduces a bifurcation-theoretic perspective on the EoS phenomenon and proves trajectory alignment in simplified neural network models.
Findings
Trajectory alignment occurs across different GD trajectories after reparameterization.
The EoS phenomenon and progressive sharpening are explained by bifurcation analysis.
Rigorous proofs are provided for a two-layer fully-connected linear network and for a single-neuron nonlinear network trained with a single data point.
Abstract
Cohen et al. (2021) empirically study the evolution of the largest eigenvalue of the loss Hessian, also known as sharpness, along the gradient descent (GD) trajectory and observe the Edge of Stability (EoS) phenomenon. The sharpness increases at the early phase of training (referred to as progressive sharpening), and eventually saturates close to the threshold of 2 / (step size). In this paper, we start by demonstrating through empirical studies that when the EoS phenomenon occurs, different GD trajectories (after a proper reparameterization) align on a specific bifurcation diagram independent of initialization. We then rigorously prove this trajectory alignment phenomenon for a two-layer fully-connected linear network and a single-neuron nonlinear network trained with a single data point. Our trajectory alignment analysis establishes both progressive sharpening and EoS…
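The threshold of 2 / (step size) mentioned in the abstract can be illustrated on the simplest possible case. The sketch below is not taken from the paper: it assumes a one-dimensional quadratic loss f(x) = 0.5 * curvature * x^2, whose sharpness is simply `curvature`, and the function name `gd_on_quadratic` is hypothetical. On this loss each GD step multiplies x by (1 - step_size * curvature), so the iterates shrink when the curvature is below 2 / step_size and grow when it is above.

```python
def gd_on_quadratic(curvature, step_size, x0=1.0, num_steps=200):
    """Run GD on f(x) = 0.5 * curvature * x**2 and return the final |x|.

    Illustrative assumption: a 1-D quadratic, where the sharpness
    (largest Hessian eigenvalue) equals `curvature`.
    """
    x = x0
    for _ in range(num_steps):
        grad = curvature * x        # f'(x) = curvature * x
        x = x - step_size * grad    # x <- (1 - step_size * curvature) * x
    return abs(x)


step_size = 0.1
threshold = 2.0 / step_size  # stability threshold for the sharpness

# Sharpness slightly below the threshold: iterates contract toward 0.
below = gd_on_quadratic(curvature=0.95 * threshold, step_size=step_size)

# Sharpness slightly above the threshold: iterates oscillate and blow up.
above = gd_on_quadratic(curvature=1.05 * threshold, step_size=step_size)

print(f"threshold = {threshold}")
print(f"|x| after GD, sharpness below threshold: {below:.3e}")
print(f"|x| after GD, sharpness above threshold: {above:.3e}")
```

A quadratic has fixed curvature, so it cannot show progressive sharpening or the saturation described in the abstract; it only shows why 2 / (step size) is the natural stability boundary for GD.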
Taxonomy
Topics: stochastic dynamics and bifurcation · Force Microscopy Techniques and Applications · Neural dynamics and brain function
Methods: ALIGN
