Understanding Nesterov's Acceleration via Proximal Point Method
Kwangjun Ahn, Suvrit Sra

TL;DR
This paper uses the proximal point method to provide simple derivations and convergence analyses of Nesterov's accelerated gradient method, offering a unified and conceptual understanding of its variants and extensions.
Contribution
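It presents a novel perspective that views Nesterov's accelerated gradient method (AGM) as an approximation of the proximal point method (PPM). This view simplifies both the derivation and the convergence analysis of AGM, and it extends to the strongly convex case.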
Findings
Elementary derivation of AGM update equations
Unified convergence analysis for AGM variants
Extension of results to strongly convex optimization
Abstract
The proximal point method (PPM) is a fundamental method in optimization that is often used as a building block for designing optimization algorithms. In this work, we use the PPM to provide conceptually simple derivations along with convergence analyses of different versions of Nesterov's accelerated gradient method (AGM). The key observation is that AGM is a simple approximation of PPM, which results in an elementary derivation of the update equations and stepsizes of AGM. This view also leads to a transparent and conceptually simple analysis of AGM's convergence by using the analysis of PPM. The derivations also naturally extend to the strongly convex case. Ultimately, the results presented in this paper are of both didactic and conceptual value; they unify and explain existing variants of AGM while motivating other accelerated methods for practically relevant settings.
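To make the relationship between PPM and AGM concrete, here is a minimal Python sketch that is not taken from the paper. It assumes a small diagonal quadratic objective, for which the PPM step x_{t+1} = argmin_x { f(x) + ||x - x_t||^2 / (2*eta) } has a closed form, and runs it next to a common textbook momentum form of AGM (stepsize 1/L, momentum weight (k-2)/(k+1)). The test problem, the function names `ppm`, `agm`, and `gap`, and all parameter values are illustrative assumptions; the paper's own derivation and stepsize choices may differ.

```python
import numpy as np

# Hypothetical test problem (not from the paper):
# f(x) = 0.5 * x^T A x - b^T x with a diagonal, ill-conditioned A.
D = np.linspace(0.01, 1.0, 20)   # eigenvalues of A
A = np.diag(D)
b = np.ones(20)
L = D.max()                      # smoothness constant of f
X_STAR = b / D                   # exact minimizer, since A is diagonal


def f(x):
    """Quadratic objective."""
    return 0.5 * x @ (A @ x) - b @ x


def grad(x):
    """Gradient of the quadratic objective."""
    return A @ x - b


def ppm(x0, eta, steps):
    """Proximal point method.

    For this quadratic, the proximal subproblem reduces to the linear
    system (I + eta * A) x_next = x + eta * b.
    """
    x = x0.copy()
    M = np.eye(len(x0)) + eta * A
    for _ in range(steps):
        x = np.linalg.solve(M, x + eta * b)
    return x


def agm(x0, steps):
    """Textbook momentum form of Nesterov's method (assumed variant)."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(1, steps + 1):
        beta = (k - 2) / (k + 1)         # momentum weight; unused effect at k = 1
        y = x + beta * (x - x_prev)      # extrapolated point
        x_prev, x = x, y - grad(y) / L   # gradient step from y with stepsize 1/L
    return x


def gap(x):
    """Suboptimality f(x) - f(x*)."""
    return f(x) - f(X_STAR)
```

Calling `gap(ppm(np.zeros(20), 1.0 / L, 200))` and `gap(agm(np.zeros(20), 200))` returns the suboptimality of each method after 200 steps. In this sketch PPM needs a linear solve per step, while AGM needs only one gradient evaluation, which is one way to see why a gradient-based approximation of PPM is attractive.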
Taxonomy
Topics: Advanced Optimization Algorithms Research · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques
