Algorithmic Instabilities of Accelerated Gradient Descent
Amit Attia, Tomer Koren

TL;DR
This paper studies the algorithmic stability of Nesterov's accelerated gradient method and shows that it deteriorates exponentially with the number of gradient steps, in contrast to the quadratic growth proved for convex quadratic objectives and the linear growth typical of non-accelerated gradient methods.
Contribution
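It disproves the conjecture of Chen et al. (2018) that the stability of Nesterov's method grows quadratically with the number of steps in the general convex and smooth case, and shows instead that it deteriorates exponentially for two notions of algorithmic stability (including uniform stability).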
Findings
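For general convex and smooth objectives, the stability of Nesterov's method deteriorates exponentially with the number of gradient steps.
This contrasts with the convex quadratic case, where uniform stability grows quadratically with the number of steps (Chen et al., 2018).
It also differs from non-accelerated gradient methods, whose stability typically grows linearly with the number of steps.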
Abstract
We study the algorithmic stability of Nesterov's accelerated gradient method. For convex quadratic objectives, Chen et al. (2018) proved that the uniform stability of the method grows quadratically with the number of optimization steps, and conjectured that the same is true for the general convex and smooth case. We disprove this conjecture and show, for two notions of algorithmic stability (including uniform stability), that the stability of Nesterov's accelerated method in fact deteriorates exponentially fast with the number of gradient steps. This stands in sharp contrast to the bounds in the quadratic case, but also to known results for non-accelerated gradient methods where stability typically grows linearly with the number of steps.
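To make the quantity in the abstract concrete, the following is a minimal sketch of how one might track the divergence between two runs of Nesterov's method on objectives that differ by a small perturbation. It rests on several assumptions that are not taken from the paper: the momentum schedule (t - 1) / (t + 2) is one common textbook variant, the function names `nesterov_agd` and `iterate_divergence` are hypothetical, and the toy objective is a one-dimensional quadratic. Because the toy is quadratic, it falls in the regime where Chen et al. (2018) proved quadratic growth; it does not reproduce the construction behind the exponential lower bound.

```python
from typing import Callable, List


def nesterov_agd(grad: Callable[[float], float], x0: float,
                 eta: float, steps: int) -> List[float]:
    """Run a common variant of Nesterov's accelerated gradient method.

    Assumed update (may differ from the exact form analysed in the paper):
        y_t     = x_t + (t - 1) / (t + 2) * (x_t - x_{t-1})
        x_{t+1} = y_t - eta * grad(y_t)
    Returns the iterates after each of the `steps` gradient steps.
    """
    x_prev, x = x0, x0
    iterates = []
    for t in range(1, steps + 1):
        momentum = (t - 1) / (t + 2)
        y = x + momentum * (x - x_prev)      # extrapolation step
        x_prev, x = x, y - eta * grad(y)     # gradient step taken at y
        iterates.append(x)
    return iterates


def iterate_divergence(grad_a: Callable[[float], float],
                       grad_b: Callable[[float], float],
                       x0: float, eta: float, steps: int) -> List[float]:
    """Distance between the iterates of two runs started from the same point."""
    xs_a = nesterov_agd(grad_a, x0, eta, steps)
    xs_b = nesterov_agd(grad_b, x0, eta, steps)
    return [abs(a - b) for a, b in zip(xs_a, xs_b)]


# Toy pair of objectives (illustrative only):
#   f(x)  = 0.005 * x**2
#   f'(x) = 0.005 * x**2 - 1e-3 * x   (a small linear perturbation of f)
def grad_f(x: float) -> float:
    return 0.01 * x


def grad_f_perturbed(x: float) -> float:
    return 0.01 * x - 1e-3


divergence = iterate_divergence(grad_f, grad_f_perturbed,
                                x0=0.0, eta=1.0, steps=50)
```

Inspecting `divergence` step by step shows how the gap between the two runs evolves; swapping in a plain gradient-descent loop for `nesterov_agd` would give a non-accelerated baseline for comparison.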
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs
