A Dynamical Systems Perspective on Nesterov Acceleration
Michael Muehlebach, Michael I. Jordan

TL;DR
This paper offers a dynamical systems perspective on Nesterov acceleration: it derives the method by discretizing an ODE without relying on a vanishing step size argument, and it highlights the role of curvature-dependent damping in the acceleration phenomenon.
Contribution
The paper introduces a dynamical systems framework for Nesterov's accelerated gradient method whose derivation does not rely on a vanishing step size argument.
Findings
Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme.
A curvature-dependent damping term is central to acceleration.
Connections between continuous and discretized dynamics are established.
Abstract
We present a dynamical system framework for understanding Nesterov's accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation as well as the discretization to obtain insights into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.
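The semi-implicit Euler structure mentioned in the abstract can be illustrated with a minimal sketch. The code below is not the paper's ODE or its constants. It only assumes the textbook constant-momentum form of Nesterov's method for a strongly convex quadratic and rewrites it in position/velocity variables by setting v_k = x_k - x_{k-1}. The function names, the test matrix `A`, and the choices `step = 1/L` and `beta = (sqrt(kappa) - 1)/(sqrt(kappa) + 1)` are illustrative assumptions.

```python
import numpy as np

def nesterov_standard(grad, x0, step, beta, iters):
    """Textbook Nesterov iteration with constant momentum beta."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        y = x + beta * (x - x_prev)          # look-ahead point
        x_prev, x = x, y - step * grad(y)    # gradient step taken from y
    return x

def nesterov_position_velocity(grad, x0, step, beta, iters):
    """Same iteration written as a semi-implicit Euler-style update.

    The velocity is updated first, using the gradient at x + beta * v,
    and the position is then advanced with the NEW velocity.
    """
    x = x0.copy()
    v = np.zeros_like(x0)                    # v_0 = x_0 - x_{-1} = 0
    for _ in range(iters):
        v = beta * v - step * grad(x + beta * v)
        x = x + v
    return x

# Hypothetical test problem: f(x) = 0.5 * x^T A x, minimized at x = 0.
A = np.diag([1.0, 10.0, 100.0])
grad = lambda x: A @ x
L_smooth, mu = 100.0, 1.0                    # largest / smallest eigenvalue of A
kappa = L_smooth / mu
step = 1.0 / L_smooth
beta = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)

x0 = np.array([1.0, 1.0, 1.0])
xa = nesterov_standard(grad, x0, step, beta, 200)
xb = nesterov_position_velocity(grad, x0, step, beta, 200)
print(np.allclose(xa, xb), np.linalg.norm(xa))
```

In the position/velocity form, the gradient is evaluated at x + beta * v rather than at x. A first-order Taylor expansion gives grad f(x + beta * v) ≈ grad f(x) + beta * Hess f(x) v, so the velocity update picks up a term proportional to the Hessian times the velocity. This is one way to read the curvature-dependent damping that the abstract refers to, under the assumptions stated above.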
Taxonomy
Topics: Numerical methods for differential equations · Mechanical and Optical Resonators · Model Reduction and Neural Networks
