Non-ergodic Complexity of Convex Proximal Inertial Gradient Descents
Tao Sun, Linbo Qiao, Dongsheng Li

TL;DR
This paper analyzes the convergence rates of proximal inertial gradient descent algorithms, establishing non-ergodic rates under various conditions and extending results to multi-block settings with stochastic strategies.
Contribution
It provides novel convergence rate results for proximal inertial gradient descent, including non-ergodic O(1/k) and linear rates under weaker conditions, and extends analysis to multi-block algorithms.
Findings
Proved non-ergodic O(1/k) convergence with constant stepsize for coercive functions.
Established sublinear rates with diminishing inertial parameters for non-coercive functions.
Demonstrated linear convergence under optimal strong convexity with larger stepsizes.
Abstract
Proximal inertial gradient descent is efficient for composite minimization and applicable to a broad range of machine learning problems. In this paper, we revisit the computational complexity of this algorithm and present other novel results, especially on the convergence rates of the objective function values. The non-ergodic O(1/k) rate is proved for proximal inertial gradient descent with constant stepsize when the objective function is coercive. When the objective function fails to be coercive, we prove a sublinear rate with diminishing inertial parameters. In the case that the objective function satisfies the optimal strong convexity condition (which is much weaker than strong convexity), linear convergence is proved with a much larger and more general stepsize than in previous literature. We also extend our results to the multi-block version and present the computational…
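To make the algorithm in the abstract concrete, here is a minimal sketch of one common form of proximal inertial gradient descent, x^{k+1} = prox_{γg}(x^k − γ∇f(x^k) + β(x^k − x^{k−1})), applied to a small lasso problem. The abstract does not state the exact update, so this form, the test problem, and every name and parameter value (`proximal_inertial_gd`, `gamma`, `beta`, `lam`, the data `A` and `b`) are assumptions chosen for illustration, not the authors' setup.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def soft_threshold(v: Vector, t: float) -> Vector:
    """Proximal operator of t * ||.||_1, applied coordinate-wise."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]


def grad_least_squares(A: Matrix, b: Vector, x: Vector) -> Vector:
    """Gradient of f(x) = 0.5 * ||A x - b||^2, which is A^T (A x - b)."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return [sum(A[i][j] * residual[i] for i in range(len(A))) for j in range(len(x))]


def proximal_inertial_gd(A: Matrix, b: Vector, lam: float,
                         gamma: float, beta: float, iters: int = 500) -> Vector:
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with a constant
    stepsize gamma and a constant inertial parameter beta (assumed values)."""
    n = len(A[0])
    x_prev = [0.0] * n
    x = [0.0] * n
    for _ in range(iters):
        g = grad_least_squares(A, b, x)
        # Gradient step plus the inertial (momentum) term beta * (x - x_prev).
        y = [xi - gamma * gi + beta * (xi - xpi)
             for xi, gi, xpi in zip(x, g, x_prev)]
        # Proximal step on the nonsmooth part lam * ||.||_1.
        x_prev, x = x, soft_threshold(y, gamma * lam)
    return x


# Illustrative data. The squared Frobenius norm of A is 7, which upper-bounds
# the Lipschitz constant of the gradient, so gamma = 1/7 is a conservative choice.
A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
b = [1.0, 2.0, 0.5]
x_hat = proximal_inertial_gd(A, b, lam=0.1, gamma=1.0 / 7.0, beta=0.3)
print(x_hat)
```

On this toy problem both coordinates of the minimizer are positive, so it can be checked by hand from the optimality condition AᵀA x = Aᵀb − λ·1, which gives x = (2.6/9, 7.4/9). Setting `beta=0.0` recovers plain proximal gradient descent for comparison.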
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems
