Error bounds, quadratic growth, and linear convergence of proximal methods

Dmitriy Drusvyatskiy; Adrian S. Lewis

arXiv:1602.06661·math.OC·June 29, 2016·Math. Oper. Res.

TL;DR

This paper explains why proximal gradient methods often converge linearly without strong convexity by linking error bounds to quadratic growth, and extends the analysis to more general proximal algorithms.

Contribution

The paper proves that an error bound (a multiple of the step length bounding the distance to the solution set) is equivalent to a natural quadratic growth condition, giving a unified explanation for the linear convergence of proximal methods.
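
For readers who want the two conditions written out, here is a sketch in notation that is an assumption of this note and does not appear in the summary: \varphi = f + g is the objective with f smooth and g nonsmooth convex, S is its solution set, \varphi^* its minimal value, t > 0 a step size, and \gamma, \alpha > 0 are constants. Both inequalities are meant to hold for x near S; the precise neighborhoods and constants are given in the paper.

```latex
% Prox-gradient mapping (assumed notation); the step length of one
% proximal gradient iteration from x is t * ||G_t(x)||.
\[
  \mathcal{G}_t(x) \;=\; t^{-1}\Bigl(x - \operatorname{prox}_{tg}\bigl(x - t\nabla f(x)\bigr)\Bigr)
\]
% Error bound: the distance to the solution set is controlled by the step length.
\[
  \operatorname{dist}(x, S) \;\le\; \gamma\,\bigl\|\mathcal{G}_t(x)\bigr\|
\]
% Quadratic growth: the objective grows at least quadratically away from S.
\[
  \varphi(x) \;\ge\; \varphi^* + \frac{\alpha}{2}\,\operatorname{dist}^2(x, S)
\]
```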

Findings

01

Error bounds are equivalent to quadratic growth conditions.

02

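Under these conditions the proximal gradient method converges linearly, and the analysis extends to proximal methods of Gauss-Newton type for compositions of nonsmooth functions with smooth mappings.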

03

Short step-lengths in the algorithm indicate near-stationarity, which suggests a reliable termination criterion.

Abstract

The proximal gradient algorithm for minimizing the sum of a smooth and a nonsmooth convex function often converges linearly even without strong convexity. One common reason is that a multiple of the step length at each iteration may linearly bound the "error" -- the distance to the solution set. We explain the observed linear convergence intuitively by proving the equivalence of such an error bound to a natural quadratic growth condition. Our approach generalizes to linear convergence analysis for proximal methods (of Gauss-Newton type) for minimizing compositions of nonsmooth functions with smooth mappings. We observe incidentally that short step-lengths in the algorithm indicate near-stationarity, suggesting a reliable termination criterion.
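
To make the abstract concrete, the following is a minimal sketch of the proximal gradient iteration on a toy problem that is not taken from the paper: minimize 0.5*(x1 + x2 - 2)^2 + lam*(|x1| + |x2|) with lam = 0.5. The smooth part is convex but not strongly convex, and the solution set is a whole segment (x1, x2 >= 0 with x1 + x2 = 1.5). The function names, the starting point, and the tolerance are illustrative assumptions. The loop stops when the step length is short, in the spirit of the termination criterion the abstract mentions.

```python
import math


def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to the scalar v."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def objective(x, lam):
    """Toy objective: 0.5 * (x1 + x2 - 2)^2 + lam * (|x1| + |x2|)."""
    r = x[0] + x[1] - 2.0
    return 0.5 * r * r + lam * (abs(x[0]) + abs(x[1]))


def proximal_gradient(x0, lam=0.5, tol=1e-10, max_iter=1000):
    """Run proximal gradient steps until the step length drops below tol.

    Returns the final iterate and the list of step lengths ||x_next - x||.
    """
    # The gradient of the smooth part is Lipschitz with constant
    # L = ||(1, 1)||^2 = 2, so t = 1/L is a standard step size choice.
    t = 0.5
    x = list(x0)
    step_lengths = []
    for _ in range(max_iter):
        r = x[0] + x[1] - 2.0  # both gradient components equal r
        x_next = [soft_threshold(xi - t * r, t * lam) for xi in x]
        step = math.hypot(x_next[0] - x[0], x_next[1] - x[1])
        step_lengths.append(step)
        x = x_next
        if step <= tol:  # short step: treat the iterate as near-stationary
            break
    return x, step_lengths


x_final, step_lengths = proximal_gradient([3.0, -1.0])
```

On this example the later step lengths shrink by a constant factor per iteration, which is the linear convergence the abstract describes; inspecting the ratios `step_lengths[k + 1] / step_lengths[k]` for the last few iterations shows this directly.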
