Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu, Vladimir Braverman, Jason D. Lee

TL;DR
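This paper analyzes constant-stepsize gradient descent (GD) for logistic regression on linearly separable data at the edge of stability (EoS). It proves that GD still minimizes the logistic loss despite local oscillations, characterizes its implicit bias toward the max-margin direction, and contrasts this with the exponential loss, which may diverge in the same regime.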
Contribution
It proves convergence and characterizes the implicit bias of constant-stepsize gradient descent for logistic regression on linearly separable data at the edge of stability, and shows that gradient descent on the exponential loss may diverge in this regime.
Findings
GD minimizes logistic loss despite local oscillations
GD iterates tend to infinity in max-margin directions
Exponential loss may diverge catastrophically at EoS
Abstract
Recent research has observed that in machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS) [Cohen et al., 2021], where the stepsizes are set to be large, resulting in non-monotonic losses induced by the GD iterates. This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime. Despite the presence of local oscillations, we prove that the logistic loss can be minimized by GD with any constant stepsize over a long time scale. Furthermore, we prove that with any constant stepsize, the GD iterates tend to infinity when projected to a max-margin direction (the hard-margin SVM direction) and converge to a fixed vector that minimizes a strongly convex potential when projected to the orthogonal complement of the max-margin direction. In contrast, we…
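The behavior described in the abstract can be illustrated with a minimal sketch. The two-point dataset, the stepsize of 10, the step count, and the function names below are assumptions chosen for illustration, not the paper's experiments. The data are linearly separable with max-margin direction along the first coordinate, so the first coordinate of the iterate should grow while the second stays bounded.

```python
import numpy as np

# Hypothetical toy data (not from the paper): two separable points whose
# max-margin direction is the first coordinate axis, with margin 0.1.
X = np.array([[0.1, 2.0], [-0.1, 1.0]])
y = np.array([1.0, -1.0])


def logistic_loss(w, X, y):
    """Mean logistic loss, log(1 + exp(-y_i <x_i, w>)), computed stably."""
    margins = y * (X @ w)
    return float(np.mean(np.logaddexp(0.0, -margins)))


def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss with respect to w."""
    margins = y * (X @ w)
    # sigmoid(-margins), written with tanh to avoid overflow
    s = 0.5 * (1.0 - np.tanh(0.5 * margins))
    return -(X * (y * s)[:, None]).mean(axis=0)


def run_gd(X, y, stepsize, num_steps):
    """Constant-stepsize GD from w = 0; returns final iterate and loss trace."""
    w = np.zeros(X.shape[1])
    losses = [logistic_loss(w, X, y)]
    for _ in range(num_steps):
        w = w - stepsize * logistic_grad(w, X, y)
        losses.append(logistic_loss(w, X, y))
    return w, np.array(losses)


# A large stepsize: the loss is non-monotonic early on, then decreases.
w_big, losses_big = run_gd(X, y, stepsize=10.0, num_steps=5000)
print("first losses:", losses_big[:6])
print("final loss:", losses_big[-1])
print("final iterate:", w_big)
```

In this sketch the loss rises on the first step, which is the non-monotonic behavior the abstract attributes to large stepsizes, yet the final loss ends up far below its initial value. Rerunning with a small stepsize such as 0.1 gives a monotonically decreasing loss for comparison.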
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Optical Imaging and Spectroscopy Techniques
Methods: Logistic Regression · Support Vector Machine
