Criteria and Bias of Parameterized Linear Regression under Edge of Stability Regime
Peiyuan Zhang, Amin Karbasi

TL;DR
This paper investigates the Edge of Stability phenomenon in gradient descent, revealing that EoS can occur even with quadratic loss functions and exploring its implications for diagonal linear networks.
Contribution
It challenges the belief that subquadratic loss is necessary for EoS, showing EoS also occurs with quadratic loss and analyzing implicit bias in diagonal linear networks.
Findings
EoS occurs with quadratic loss under certain conditions.
GD converges to a linear interpolator non-asymptotically.
Diagonal linear networks exhibit implicit bias under large step-sizes.
Abstract
Classical optimization theory requires a small step-size for gradient-based methods to converge. Nevertheless, recent findings challenge this traditional idea by empirically demonstrating that Gradient Descent (GD) converges even when the step-size exceeds the threshold of $2/L$, where $L$ is the global smoothness constant. This is usually known as the Edge of Stability (EoS) phenomenon. A widely held belief suggests that an objective function with subquadratic growth plays an important role in incurring EoS. In this paper, we provide a more comprehensive answer by considering the task of finding a linear interpolator $\beta$ for regression with loss function $\ell(\cdot)$, where $\beta$ admits the parameterization $\beta = w_+^2 - w_-^2$. Contrary to previous work that suggests a subquadratic $\ell$ is necessary for EoS, our novel finding reveals that EoS occurs even when $\ell$ is…
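The setting described in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the authors' experimental setup: it assumes the depth-2 diagonal linear network parameterization $\beta = w_+^2 - w_-^2$ (elementwise squares) with a quadratic loss, runs plain GD, and tracks the sharpness (largest Hessian eigenvalue), which is the local quantity usually compared against $2/\eta$ in discussions of EoS. The data `X_demo`, `y_demo`, the step-size `eta`, and the initialization scale `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np

def beta(w_plus, w_minus):
    """Effective linear predictor: beta = w_+^2 - w_-^2 (elementwise)."""
    return w_plus**2 - w_minus**2

def loss(w_plus, w_minus, X, y):
    """Quadratic loss (1 / 2n) * sum_i (<x_i, beta> - y_i)^2."""
    r = X @ beta(w_plus, w_minus) - y
    return 0.5 * np.mean(r**2)

def grads(w_plus, w_minus, X, y):
    """Gradients of the loss with respect to w_plus and w_minus (chain rule)."""
    r = X @ beta(w_plus, w_minus) - y
    g_beta = X.T @ r / len(y)          # gradient with respect to beta
    return 2.0 * w_plus * g_beta, -2.0 * w_minus * g_beta

def sharpness(w_plus, w_minus, X, y):
    """Largest eigenvalue of the Hessian in the stacked (w_plus, w_minus) variables."""
    n = len(y)
    r = X @ beta(w_plus, w_minus) - y
    g_beta = X.T @ r / n
    # Jacobian of the predictions X @ beta with respect to (w_plus, w_minus).
    J = np.hstack([2.0 * X * w_plus, -2.0 * X * w_minus])
    # Gauss-Newton term plus the residual-weighted curvature of beta.
    H = J.T @ J / n + np.diag(np.concatenate([2.0 * g_beta, -2.0 * g_beta]))
    return np.linalg.eigvalsh(H)[-1]

def run_gd(X, y, eta, steps, alpha):
    """Plain GD from w_plus = w_minus = alpha, so beta starts at zero."""
    d = X.shape[1]
    w_plus = np.full(d, alpha)
    w_minus = np.full(d, alpha)
    history = []  # (loss, sharpness) recorded before each update
    for _ in range(steps):
        history.append((loss(w_plus, w_minus, X, y),
                        sharpness(w_plus, w_minus, X, y)))
        g_plus, g_minus = grads(w_plus, w_minus, X, y)
        w_plus = w_plus - eta * g_plus
        w_minus = w_minus - eta * g_minus
    return w_plus, w_minus, history

# Hypothetical one-sample problem with a deliberately small step-size.
X_demo = np.array([[1.0, 0.5]])
y_demo = np.array([1.0])
eta = 0.05
w_p, w_m, hist = run_gd(X_demo, y_demo, eta=eta, steps=2000, alpha=0.5)
print("final loss:", loss(w_p, w_m, X_demo, y_demo))
print("final sharpness:", sharpness(w_p, w_m, X_demo, y_demo), "vs 2/eta:", 2.0 / eta)
```

With the small `eta` used here the run stays in the classical stable regime. Increasing `eta` until `2 / eta` drops below the observed sharpness is one way to probe the large step-size regime the paper studies; what happens there depends on the data and initialization, and the sketch makes no claim about it.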
Taxonomy
Topics: Advanced Scientific Research Methods · Advanced Statistical Methods and Models · Statistical and Computational Modeling
