Understanding Nesterov's Acceleration via Proximal Point Method

Kwangjun Ahn; Suvrit Sra

arXiv:2005.08304 · math.OC · June 3, 2022

TL;DR

This paper uses the proximal point method to provide simple derivations and convergence analyses of Nesterov's accelerated gradient method, offering a unified and conceptual understanding of its variants and extensions.

Contribution

The paper presents a novel perspective that views AGM as an approximation of PPM. This view simplifies the derivation and analysis of AGM and extends the results to the strongly convex case.

Findings

1. Elementary derivation of AGM update equations
2. Unified convergence analysis for AGM variants
3. Extension of results to strongly convex optimization

Abstract

The proximal point method (PPM) is a fundamental method in optimization that is often used as a building block for designing optimization algorithms. In this work, we use PPM to provide conceptually simple derivations, along with convergence analyses, of different versions of Nesterov's accelerated gradient method (AGM). The key observation is that AGM is a simple approximation of PPM, which results in an elementary derivation of the update equations and stepsizes of AGM. This view also leads to a transparent and conceptually simple analysis of AGM's convergence by using the analysis of PPM. The derivations also naturally extend to the strongly convex case. Ultimately, the results presented in this paper are of both didactic and conceptual value; they unify and explain existing variants of AGM while motivating other accelerated methods for practically relevant settings.
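
The two methods named in the abstract can be made concrete on a toy problem. The sketch below is a minimal illustration, not the paper's derivation: it implements an exact PPM step and the textbook form of AGM (momentum (t - 1) / (t + 2), stepsize 1/L) on a diagonal quadratic chosen here purely for illustration. The test function, the names `ppm_step`, `ppm`, and `agm`, and all constants are assumptions, not taken from the paper.

```python
# Hypothetical test problem (an assumption for illustration):
# f(x) = 0.5 * sum(a_i * x_i^2), which is convex and minimized at x = 0.
CURVATURES = [1.0, 10.0, 100.0]   # a_i > 0
L_SMOOTH = max(CURVATURES)        # smoothness constant of f


def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(CURVATURES, x))


def grad(x):
    return [a * xi for a, xi in zip(CURVATURES, x)]


def ppm_step(x, eta):
    """One proximal point step: argmin_u f(u) + ||u - x||^2 / (2 * eta).

    For this diagonal quadratic the minimizer has a per-coordinate closed form.
    """
    return [xi / (1.0 + eta * a) for a, xi in zip(CURVATURES, x)]


def ppm(x0, eta, num_iters):
    """Repeated exact proximal point steps with a fixed stepsize eta."""
    x = list(x0)
    for _ in range(num_iters):
        x = ppm_step(x, eta)
    return x


def agm(x0, num_iters):
    """Textbook Nesterov AGM for an L-smooth convex function.

    Uses only gradients: extrapolate with momentum, then take a 1/L gradient step.
    """
    x_prev = list(x0)
    x = list(x0)
    for t in range(1, num_iters + 1):
        beta = (t - 1) / (t + 2)
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y)
        x_prev = x
        x = [yi - gi / L_SMOOTH for yi, gi in zip(y, g)]
    return x


if __name__ == "__main__":
    x0 = [1.0, 1.0, 1.0]
    print("f(x0)          =", f(x0))
    print("f after 50 PPM =", f(ppm(x0, eta=0.5, num_iters=50)))
    print("f after 50 AGM =", f(agm(x0, num_iters=50)))
```

On this quadratic the PPM step happens to have a closed form. In general each PPM step is itself an optimization problem, which is why an approximation that needs only gradients, such as AGM, is of practical interest.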

Taxonomy

Topics: Advanced Optimization Algorithms Research · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques