Non-ergodic Complexity of Convex Proximal Inertial Gradient Descents

Tao Sun; Linbo Qiao; Dongsheng Li

arXiv:1801.07389 · math.OC · July 19, 2019

TL;DR

This paper analyzes the convergence rates of proximal inertial gradient descent, establishing non-ergodic rates for the objective values under coercivity, diminishing inertial parameters, and optimal strong convexity, and extending the results to multi-block settings with stochastic strategies.

Contribution

It provides novel convergence rate results for proximal inertial gradient descent, including non-ergodic O(1/k) and linear rates under weaker conditions, and extends analysis to multi-block algorithms.

Findings

01

Proved non-ergodic O(1/k) convergence with constant stepsize for coercive functions.

02

Established sublinear rates with diminishing inertial parameters for non-coercive functions.

03

Demonstrated linear convergence under optimal strong convexity with larger stepsizes.
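
For readers unfamiliar with the term, the distinction behind "non-ergodic" in the findings above can be written as follows. The notation is assumed here for illustration and is not taken from the paper: F is the objective, x^k the k-th iterate, and C a generic constant.

```latex
% Ergodic rate: the bound holds for an average of the iterates.
F\Big(\tfrac{1}{k}\sum_{i=1}^{k} x^{i}\Big) - \min F \;\le\; \frac{C}{k}

% Non-ergodic rate: the bound holds for the current iterate itself.
F(x^{k}) - \min F \;\le\; \frac{C}{k}
```

A non-ergodic bound applies directly to the point the algorithm returns, with no averaging step.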

Abstract

The proximal inertial gradient descent is efficient for composite minimization and applicable to a broad range of machine learning problems. In this paper, we revisit the computational complexity of this algorithm and present other novel results, especially on the convergence rates of the objective function values. The non-ergodic O(1/k) rate is proved for proximal inertial gradient descent with constant stepsize when the objective function is coercive. When the objective function is not guaranteed to be coercive, we prove the sublinear rate with diminishing inertial parameters. In the case that the objective function satisfies the optimal strong convexity condition (which is much weaker than strong convexity), the linear convergence is proved with a much larger and more general stepsize than in the previous literature. We also extend our results to the multi-block version and present the computational…
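
The abstract does not spell out the iteration, so the following is only a minimal sketch. It assumes the commonly used heavy-ball form of the method, x_next = prox_{gamma*g}(x - gamma*grad_f(x) + beta*(x - x_prev)), applied to a lasso problem chosen here for illustration. The function names, the test problem, and the values of `beta` and `gamma` are hypothetical and conservative; they are not the stepsize bounds derived in the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (the nonsmooth part g in this example).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_inertial_gd(A, b, lam, beta=0.3, iters=500):
    # Minimizes F(x) = 0.5 * ||A x - b||^2 + lam * ||x||_1.
    # Assumed update (heavy-ball form, not quoted from the paper):
    #   x_next = prox_{gamma*g}(x - gamma * grad_f(x) + beta * (x - x_prev))
    L = np.linalg.norm(A, 2) ** 2   # Lipschitz constant of grad_f
    gamma = 1.0 / L                 # constant stepsize (illustrative choice)
    x_prev = np.zeros(A.shape[1])
    x = x_prev.copy()
    history = []
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        y = x - gamma * grad + beta * (x - x_prev)   # gradient step plus inertia
        x_prev, x = x, soft_threshold(y, gamma * lam)
        # Record the objective at the current iterate (the non-ergodic quantity).
        history.append(0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x)))
    return x, np.array(history)

# Synthetic sparse recovery problem (made up for this sketch).
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = 1.0
b = A @ x_true
x_hat, history = proximal_inertial_gd(A, b, lam=0.1)
```

Setting `beta=0.0` reduces the loop to plain proximal gradient descent, which makes it easy to compare the two objective histories on the same problem.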

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems