On the Convergence and Complexity of Proximal Gradient and Accelerated Proximal Gradient Methods under Adaptive Gradient Estimation
Raghu Bollapragada, Shagun Gupta

TL;DR
This paper embeds adaptive gradient estimation in proximal gradient and accelerated proximal gradient methods for composite optimization, and reports optimal iteration complexity with biased or stochastic gradient estimates when the smooth component is nonconvex, convex, or strongly convex.
Contribution
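The paper develops proximal gradient and accelerated proximal gradient algorithms that adaptively refine the accuracy of their gradient estimates as the iterates approach a solution, with convergence and complexity guarantees for nonconvex, convex, and strongly convex smooth components.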
Findings
Achieves optimal iteration complexity for first-order methods.
Demonstrates efficiency with biased and unbiased gradient estimates.
Validates theoretical results through numerical experiments.
Abstract
In this paper, we propose a proximal gradient method and an accelerated proximal gradient method for solving composite optimization problems, where the objective function is the sum of a smooth and a convex, possibly nonsmooth, function. We consider settings where the smooth component is either a finite-sum function or an expectation of a stochastic function, making it computationally expensive or impractical to evaluate its gradient. To address this, we utilize gradient estimates within the proximal gradient framework. Our methods dynamically adjust the accuracy of these estimates, increasing it as the iterates approach a solution, thereby enabling high-precision solutions with minimal computational cost. We analyze the methods when the smooth component is nonconvex, convex, or strongly convex, using a biased gradient estimate. In all cases, the methods achieve the optimal iteration…
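To make the idea in the abstract concrete, the following is a minimal sketch of a proximal gradient loop whose gradient estimate becomes more accurate over time. It is not the authors' algorithm: the abstract does not state the rule used to adjust accuracy, so this sketch assumes a simple geometric growth of the mini-batch size as a stand-in. The problem (an l1-regularized least-squares finite sum), the function names, and the parameters `batch0` and `growth` are all illustrative assumptions.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def batch_gradient(A, b, x, idx):
    """Average gradient of 0.5 * (a_i . x - b_i)^2 over the rows in idx."""
    g = [0.0] * len(x)
    for i in idx:
        residual = sum(aij * xj for aij, xj in zip(A[i], x)) - b[i]
        for j, aij in enumerate(A[i]):
            g[j] += residual * aij
    return [gj / len(idx) for gj in g]


def prox_grad_growing_batch(A, b, lam, step, iters=200,
                            batch0=2, growth=1.1, seed=0):
    """Proximal gradient for (1/n) sum_i 0.5*(a_i.x - b_i)^2 + lam*||x||_1.

    ASSUMPTION: the batch size grows geometrically (factor `growth`) until it
    reaches n. This is a placeholder for an adaptive accuracy rule, not the
    rule from the paper.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    batch = float(batch0)
    for _ in range(iters):
        m = min(n, int(batch))
        idx = rng.sample(range(n), m)        # subsample without replacement
        g = batch_gradient(A, b, x, idx)     # gradient estimate
        # Gradient step on the smooth part, then prox of the nonsmooth part.
        x = soft_threshold([xj - step * gj for xj, gj in zip(x, g)],
                           step * lam)
        batch *= growth                      # tighten the estimate over time
    return x


if __name__ == "__main__":
    data_rng = random.Random(1)
    x_true = [1.0, 0.0, -2.0, 0.0, 0.0]
    A = [[data_rng.gauss(0, 1) for _ in x_true] for _ in range(200)]
    b = [sum(a * x for a, x in zip(row, x_true)) for row in A]
    print(prox_grad_growing_batch(A, b, lam=0.01, step=0.1))
```

Early iterations use cheap, noisy estimates; once the batch covers all n terms, the loop reduces to the standard deterministic proximal gradient method. An adaptive scheme would replace the fixed `growth` schedule with a test that decides when the current estimate is too inaccurate.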
