Rethinking the Variational Interpretation of Nesterov's Accelerated Method
Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand

TL;DR
This paper reexamines the variational interpretation of Nesterov's accelerated method. It shows that, in the Bregman Lagrangian framework, Nesterov's trajectory is a stationary point of the action but is often a saddle point rather than a minimizer, which challenges a common reading of the variational view.
Contribution
It provides an in-depth calculus-of-variations analysis of the action associated with the Bregman Lagrangian, showing that Nesterov's trajectory is often a saddle point of that action rather than a minimizer.
Findings
Nesterov's trajectory is a stationary point of the action associated with the Bregman Lagrangian, and it is often a saddle point.
This challenges the assumption that Nesterov's method minimizes the action.
The analysis provides new geometric insight into accelerated optimization paths.
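The distinction between a stationary point and a minimizer of the action can be made concrete in a simple setting. The sketch below is an illustration under stated assumptions, not the paper's own derivation: it assumes the Euclidean special case of the Bregman Lagrangian (up to constant scaling), a twice-differentiable convex objective f, and a fixed time interval [t_1, t_2]. The symbols J, h, t_1 and t_2 are notation introduced here.

```latex
% Assumed Euclidean special case of the Bregman Lagrangian (up to scaling)
% and its action on an interval [t_1, t_2]:
\mathcal{L}(X, \dot X, t) = t^{3}\left(\tfrac{1}{2}\|\dot X\|^{2} - f(X)\right),
\qquad
J[X] = \int_{t_1}^{t_2} \mathcal{L}\big(X(t), \dot X(t), t\big)\, dt .

% Stationarity (first variation vanishes) gives the Euler-Lagrange equation,
% which is the continuous-time model of Nesterov's momentum:
\frac{d}{dt}\big(t^{3}\dot X\big) = -t^{3}\nabla f(X)
\quad\Longleftrightarrow\quad
\ddot X + \frac{3}{t}\dot X + \nabla f(X) = 0 .

% Being a minimizer additionally requires a nonnegative second variation
% for every perturbation h with h(t_1) = h(t_2) = 0:
\delta^{2} J[X](h) = \int_{t_1}^{t_2} t^{3}\left(\|\dot h\|^{2}
  - h^{\top}\nabla^{2} f(X)\, h\right) dt .
```

For convex f the Hessian term enters with a negative sign, so the second variation can become negative for suitable perturbations h. A trajectory can therefore satisfy the Euler-Lagrange equation without minimizing the action.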
Abstract
The continuous-time model of Nesterov's momentum provides a thought-provoking perspective for understanding the nature of the acceleration phenomenon in convex optimization. One of the main ideas in this line of research comes from the field of classical mechanics and proposes to link Nesterov's trajectory to the solution of a set of Euler-Lagrange equations relative to the so-called Bregman Lagrangian. In recent years, this approach has led to the discovery of many new (stochastic) accelerated algorithms and provided a solid theoretical foundation for the design of structure-preserving accelerated methods. In this work, we revisit this idea and provide an in-depth analysis of the action relative to the Bregman Lagrangian from the point of view of calculus of variations. Our main finding is that, while Nesterov's method is a stationary point for the action, it is often not a minimizer but…
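To make the continuous-time model mentioned in the abstract tangible, here is a minimal Python sketch that integrates the ODE X'' + (3/t) X' + ∇f(X) = 0, the form commonly associated with Nesterov's momentum in the Euclidean setting. Everything specific is an assumption for illustration and is not taken from the paper: the objective f(x) = 0.5 x², the start time t0 = 0.01 (the ODE is singular at t = 0), the step size, and the hand-written RK4 integrator. The function names are hypothetical.

```python
def f(x):
    # Illustrative convex objective (assumption): f(x) = 0.5 * x^2, minimized at 0.
    return 0.5 * x * x


def grad_f(x):
    # Gradient of the illustrative objective.
    return x


def rhs(t, x, v):
    # First-order form of X'' + (3/t) X' + grad f(X) = 0 with V = X'.
    return v, -(3.0 / t) * v - grad_f(x)


def integrate(x0, t0=0.01, t_end=10.0, dt=1e-3):
    """Integrate the ODE with classical RK4 from (X, V) = (x0, 0) at time t0."""
    n_steps = int(round((t_end - t0) / dt))
    t, x, v = t0, x0, 0.0
    for _ in range(n_steps):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v)
        k3x, k3v = rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v)
        k4x, k4v = rhs(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
    return t, x, v


if __name__ == "__main__":
    t, x, v = integrate(x0=1.0)
    # Compare the objective value with the O(1/t^2) reference rate 2 * x0^2 / t^2.
    print(f"t = {t:.2f}, f(X(t)) = {f(x):.3e}, reference 2/t^2 = {2.0 / t**2:.3e}")
```

The script only simulates the trajectory that solves the Euler-Lagrange equations. It does not evaluate the action or test whether that trajectory is a minimizer or a saddle point.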
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Topological and Geometric Data Analysis
