A Retrospective Approximation Approach for Smooth Stochastic Optimization
David Newton, Raghu Bollapragada, Raghu Pasupathy, Nung Kwan Yip

TL;DR
This paper introduces Retrospective Approximation (RA), a flexible stochastic optimization method that integrates deterministic solvers within each iteration, improving efficiency and robustness over traditional stochastic gradient approaches.
Contribution
RA generalizes stochastic gradient descent by allowing multiple solver steps per iteration, and it comes with a rigorous theoretical foundation and practical termination criteria.
Findings
RA remains consistent under weak conditions as sample sizes increase.
RA achieves optimal complexity rates with a practical termination criterion.
Numerical experiments show RA's potential to reduce hyper-parameter tuning.
Abstract
Stochastic Gradient (SG) is the de facto iterative technique to solve stochastic optimization (SO) problems with a smooth (non-convex) objective and a stochastic first-order oracle. SG's attractiveness is due in part to its simplicity of executing a single step along the negative subsampled gradient direction to update the incumbent iterate. In this paper, we question SG's choice of executing a single step as opposed to multiple steps between subsample updates. Our investigation leads naturally to generalizing SG into Retrospective Approximation (RA) where, during each iteration, a "deterministic solver" executes possibly multiple steps on a subsampled deterministic problem and stops when further solving is deemed unnecessary from the standpoint of statistical efficiency. RA thus rigorizes what is appealing for implementation -- during each iteration, "plug in" a solver, e.g., L-BFGS…
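The RA loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or code. It assumes a toy objective E[0.5*||x - xi||^2] with xi drawn from a unit-variance Gaussian, plain gradient descent as the plugged-in deterministic solver, geometric sample-size growth, and an inner stopping rule that halts once the subsampled gradient norm falls below c / sqrt(m). The function names and all parameters (`ra_minimize`, `m0`, `growth`, `step`, `c`) are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_gradient(x, sample):
    """Gradient of the sample-average objective (1/m) * sum_i 0.5*||x - xi_i||^2."""
    m = len(sample)
    return [xj - sum(xi[j] for xi in sample) / m for j, xj in enumerate(x)]


def ra_minimize(mu, x0, outer_iters=8, m0=8, growth=2.0, step=0.5, c=1.0, seed=0):
    """Toy retrospective-approximation loop (illustrative assumptions throughout).

    Each outer iteration draws a fresh sample, fixes it, and runs a
    deterministic solver (here: gradient descent) on the resulting
    sample-average problem, warm-started from the previous solution.
    """
    rng = random.Random(seed)
    x = list(x0)
    m = m0
    history = []  # (sample size, inner solver steps) per outer iteration
    for _ in range(outer_iters):
        # Subsampled deterministic problem: the sample is held fixed below.
        sample = [[rng.gauss(mj, 1.0) for mj in mu] for _ in range(m)]
        # Assumed stopping tolerance, shrinking at the sampling-error rate.
        tol = c / math.sqrt(m)
        inner_steps = 0
        g = sample_gradient(x, sample)
        while math.sqrt(sum(gj * gj for gj in g)) > tol and inner_steps < 100:
            x = [xj - step * gj for xj, gj in zip(x, g)]
            g = sample_gradient(x, sample)
            inner_steps += 1
        history.append((m, inner_steps))
        m = int(math.ceil(growth * m))  # assumed geometric sample growth
    return x, history


if __name__ == "__main__":
    x, history = ra_minimize(mu=[1.0, -2.0], x0=[0.0, 0.0])
    print("final iterate:", x)
    print("(sample size, inner steps):", history)
```

The design point the sketch tries to convey is that the inner solver is not run to full convergence: solving a sample-average problem more accurately than its sampling error allows is wasted work, so the tolerance is loosened for small samples and tightened as the sample grows. Setting the inner step cap to 1 would reduce the loop to a single subsampled gradient step per sample, which is the SG-like special case.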
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization · Sparse and Compressive Sensing Techniques
