A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
Weijie Su, Stephen Boyd, Emmanuel J. Candes

TL;DR
This paper derives a second-order differential equation as the continuous-time limit of Nesterov's accelerated gradient method, uses it to analyze the scheme, and obtains a restarted variant that provably converges at a linear rate for strongly convex objectives.
Contribution
It introduces a differential equation model for Nesterov's method, offering a novel analytical framework and new algorithms with linear convergence in strongly convex settings.
Findings
The ODE closely approximates Nesterov's scheme.
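The ODE-based analysis yields a family of schemes with convergence rates similar to Nesterov's.
Restarting Nesterov's scheme, as the ODE interpretation suggests, gives an algorithm that provably converges at a linear rate whenever the objective is strongly convex.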
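For reference, the scheme and its limiting ODE can be written as below. This is recalled from the full paper rather than from the summary on this page, so the notation is an assumption here: f is the convex objective, s the step size, x_0 the starting point, and t the continuous time that corresponds to iteration k.

```latex
% Nesterov's scheme with step size s, started from y_0 = x_0:
x_k = y_{k-1} - s \nabla f(y_{k-1}), \qquad
y_k = x_k + \frac{k-1}{k+2}\,(x_k - x_{k-1})

% Limiting ODE as s \to 0, under the time scaling t \approx k\sqrt{s}:
\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f(X(t)) = 0,
\qquad X(0) = x_0, \quad \dot{X}(0) = 0
```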
Abstract
We derive a second-order ordinary differential equation (ODE) which is the limit of Nesterov's accelerated gradient method. This ODE exhibits approximate equivalence to Nesterov's scheme and thus can serve as a tool for analysis. We show that the continuous time ODE allows for a better understanding of Nesterov's scheme. As a byproduct, we obtain a family of schemes with similar convergence rates. The ODE interpretation also suggests restarting Nesterov's scheme leading to an algorithm, which can be rigorously proven to converge at a linear rate whenever the objective is strongly convex.
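The restarting idea in the abstract can be illustrated with a minimal sketch. It assumes the usual form of Nesterov's scheme (gradient step from y, then momentum with coefficient (k-1)/(k+2)) and a simplified speed-based restart rule: drop the momentum once the distance between consecutive iterates starts shrinking. The function name `nesterov`, the `restart` flag, and the guard `k_min` are illustrative choices, not the paper's exact algorithm or constants.

```python
import numpy as np

def nesterov(grad, x0, s, n_iters, restart=False, k_min=10):
    """Nesterov's accelerated gradient with an optional speed-based restart.

    grad    : callable returning the gradient of the objective
    s       : step size (assumed to be at most 1/L for an L-smooth objective)
    restart : if True, reset the momentum when the step length decreases
    k_min   : hypothetical guard; no restart in the first k_min steps of a cycle
    """
    x_prev = np.asarray(x0, dtype=float)
    y = x_prev.copy()
    k = 1              # momentum counter, reset to 1 on restart
    last_step = 0.0    # length of the previous move ||x_k - x_{k-1}||
    for _ in range(n_iters):
        x = y - s * grad(y)                    # gradient step from y
        step = np.linalg.norm(x - x_prev)
        if restart and k >= k_min and step < last_step:
            # Speed has started to drop: discard momentum, restart from x.
            k = 1
            y = x.copy()
        else:
            # Momentum step with coefficient (k - 1) / (k + 2).
            y = x + (k - 1) / (k + 2) * (x - x_prev)
            k += 1
        last_step = step
        x_prev = x
    return x_prev

# Usage on a strongly convex quadratic f(x) = 0.5 * sum(a * x**2).
a = np.array([1.0, 100.0])                    # curvatures: mu = 1, L = 100
f = lambda x: 0.5 * np.sum(a * x**2)
grad_f = lambda x: a * x
x_plain = nesterov(grad_f, [1.0, 1.0], s=0.01, n_iters=500)
x_restart = nesterov(grad_f, [1.0, 1.0], s=0.01, n_iters=500, restart=True)
print(f(x_plain), f(x_restart))
```

The guard `k_min` matters in this sketch: the first momentum coefficient after a restart is zero, so without the guard the rule would fire on every other step and the method would reduce to plain gradient descent.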
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
