Product-Stability: Provable Convergence for Gradient Descent on the Edge of Stability
Eric Gan

TL;DR
This paper introduces product-stability, a property of loss functions, and proves that gradient descent can converge at the Edge of Stability for a broad class of losses, explaining stable training dynamics.
Contribution
The paper defines product-stability and proves that gradient descent can converge at the Edge of Stability for losses with product-stable minima, generalizing prior results that relied on restrictive assumptions or on specific squared-loss-type objectives.
Findings
Gradient descent can provably converge to a local minimum at EoS for losses with product-stable minima.
Bifurcation diagrams characterize training dynamics and oscillations.
Sharpness at convergence can be precisely quantified.
Abstract
Empirically, modern deep learning training often occurs at the Edge of Stability (EoS), where the sharpness of the loss exceeds the threshold below which classical convergence analysis applies. Despite recent progress, existing theoretical explanations of EoS either rely on restrictive assumptions or focus on specific squared-loss-type objectives. In this work, we introduce and study a structural property of loss functions that we term product-stability. We show that for losses with product-stable minima, gradient descent applied to objectives of the form can provably converge to the local minimum even when training in the EoS regime. This framework substantially generalizes prior results and applies to a broad class of losses, including binary cross entropy. Using bifurcation diagrams, we characterize the resulting training dynamics, explain the emergence of…
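For readers unfamiliar with the threshold mentioned in the abstract, the following is a minimal sketch of the classical stability condition, not the paper's method. It assumes a one-dimensional quadratic f(x) = 0.5 * a * x^2, whose sharpness (second derivative) is a. Gradient descent with step size eta multiplies the iterate by (1 - eta * a) at each step, so it contracts only when a < 2 / eta. The function name and the parameter values are illustrative choices, not taken from the paper.

```python
def gradient_descent_quadratic(sharpness, step_size, x0=1.0, num_steps=50):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2.

    Hypothetical helper for illustration only; returns the list of iterates.
    """
    xs = [x0]
    for _ in range(num_steps):
        grad = sharpness * xs[-1]          # f'(x) = sharpness * x
        xs.append(xs[-1] - step_size * grad)
    return xs


step_size = 0.1
threshold = 2.0 / step_size  # classical stability threshold on sharpness

# Sharpness slightly below the threshold: iterates oscillate in sign but shrink.
below = gradient_descent_quadratic(sharpness=0.9 * threshold, step_size=step_size)

# Sharpness slightly above the threshold: iterates oscillate in sign and grow.
above = gradient_descent_quadratic(sharpness=1.1 * threshold, step_size=step_size)

print(f"threshold 2/eta = {threshold}")
print(f"|x_50| below threshold: {abs(below[-1]):.3e}")
print(f"|x_50| above threshold: {abs(above[-1]):.3e}")
```

On a pure quadratic, exceeding the threshold always leads to divergence. The EoS regime described in the abstract concerns non-quadratic objectives, where training can remain stable even though the sharpness is above 2 / eta.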
