Error bounds, quadratic growth, and linear convergence of proximal methods
Dmitriy Drusvyatskiy, Adrian S. Lewis

TL;DR
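This paper explains why the proximal gradient method often converges linearly without strong convexity by proving that an error bound is equivalent to a quadratic growth condition, and extends the analysis to proximal methods of Gauss-Newton type for compositions of nonsmooth functions with smooth mappings.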
Contribution
It establishes the equivalence between error bounds and quadratic growth conditions, providing a unified explanation for linear convergence in proximal methods.
Findings
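An error bound, in which a multiple of the step length bounds the distance to the solution set, is equivalent to a natural quadratic growth condition.
Under these conditions the proximal gradient method converges linearly, and the same analysis covers proximal methods of Gauss-Newton type.
Short step lengths in the algorithm indicate near-stationarity, which suggests a reliable termination criterion.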
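The two conditions named in the first finding can be written out as below. This is a sketch in assumed notation that does not appear on this page: the objective is \varphi = f + g with f smooth and convex and g closed and convex, S is the solution set, \varphi^* is the minimal value, t > 0 is a step size, and x^+ is the proximal gradient update from x. The constants \gamma and \alpha and the region where the inequalities hold (for example, a sublevel set near S) are placeholders; the paper gives the precise statement.

```latex
\begin{align*}
  x^{+} &= \operatorname{prox}_{t g}\bigl(x - t \nabla f(x)\bigr)
    && \text{(proximal gradient step)} \\
  \operatorname{dist}(x, S) &\le \gamma \,\lVert x^{+} - x \rVert
    && \text{(error bound: step length bounds the error)} \\
  \varphi(x) &\ge \varphi^{*} + \tfrac{\alpha}{2}\, \operatorname{dist}(x, S)^{2}
    && \text{(quadratic growth)}
\end{align*}
```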
Abstract
The proximal gradient algorithm for minimizing the sum of a smooth and a nonsmooth convex function often converges linearly even without strong convexity. One common reason is that a multiple of the step length at each iteration may linearly bound the "error" -- the distance to the solution set. We explain the observed linear convergence intuitively by proving the equivalence of such an error bound to a natural quadratic growth condition. Our approach generalizes to linear convergence analysis for proximal methods (of Gauss-Newton type) for minimizing compositions of nonsmooth functions with smooth mappings. We observe incidentally that short step-lengths in the algorithm indicate near-stationarity, suggesting a reliable termination criterion.
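To make the setting concrete, here is a minimal sketch of the proximal gradient method with a step-length stopping rule, in the spirit of the termination criterion the abstract suggests. The lasso problem, the function names, the fixed step size 1/L, and the tolerance are illustrative assumptions, not taken from the paper. The example is underdetermined (more columns than rows), so the objective is not strongly convex.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def lasso_objective(A, b, lam, x):
    """0.5 * ||Ax - b||^2 + lam * ||x||_1 (assumed example objective)."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))


def proximal_gradient_lasso(A, b, lam, tol=1e-8, max_iter=10000):
    """Proximal gradient with fixed step 1/L, stopping on a short step.

    Returns the final iterate and the list of step lengths ||x_next - x||.
    """
    # L is the Lipschitz constant of the gradient of 0.5 * ||Ax - b||^2.
    L = np.linalg.norm(A, 2) ** 2
    t = 1.0 / L
    x = np.zeros(A.shape[1])
    step_lengths = []
    for _ in range(max_iter):
        grad = A.T @ (A @ x - b)
        x_next = soft_threshold(x - t * grad, t * lam)
        step = np.linalg.norm(x_next - x)
        step_lengths.append(step)
        x = x_next
        # Hypothetical stopping rule: a short step signals near-stationarity.
        if step <= tol:
            break
    return x, step_lengths


# Small synthetic instance: 20 equations, 50 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
b = rng.standard_normal(20)
lam = 0.5
x_hat, step_lengths = proximal_gradient_lasso(A, b, lam)
```

Plotting `step_lengths` on a logarithmic axis is one way to inspect the convergence rate empirically; a roughly straight line would be consistent with linear convergence.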
