Stochastic Bias-Reduced Gradient Methods
Hilal Asi, Yair Carmon, Arun Jambulapati, Yujia Jin, Aaron Sidford

TL;DR
This paper uses multilevel Monte Carlo to build a low-bias, low-cost estimator of the minimizer of a Lipschitz strongly-convex function, which in turn yields nearly unbiased gradient estimators for smoothed objectives and speeds up several stochastic optimization problems.
Contribution
It develops a novel estimator for the minimizer of Lipschitz strongly-convex functions, improving stochastic optimization efficiency and enabling dimension-free smoothing.
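For context, the smoothing referred to here is, per the abstract, based on the Moreau-Yoshida envelope. The standard definition is sketched below; the smoothing parameter $\lambda > 0$ and the notation $y_\lambda(x)$ are assumptions introduced for illustration and are not taken from this page.

$$
f_\lambda(x) = \min_{y} \Big\{ f(y) + \frac{\lambda}{2}\,\|y - x\|^2 \Big\},
\qquad
y_\lambda(x) = \arg\min_{y} \Big\{ f(y) + \frac{\lambda}{2}\,\|y - x\|^2 \Big\},
\qquad
\nabla f_\lambda(x) = \lambda\,\big(x - y_\lambda(x)\big).
$$

The inner objective is $\lambda$-strongly convex whenever $f$ is convex, so a nearly unbiased estimator of its minimizer $y_\lambda(x)$ gives a nearly unbiased estimator of $\nabla f_\lambda(x)$.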
Findings
Achieves bias $\delta$, variance $O(\log(1/\delta))$, and cost $O(\log(1/\delta))$ in estimation.
Improves optimization of the maximum of $N$ functions, matching lower bounds up to logarithmic factors.
Enables nearly linear-time, differentially-private non-smooth stochastic optimization.
Abstract
We develop a new primitive for stochastic optimization: a low-bias, low-cost estimator of the minimizer of any Lipschitz strongly-convex function. In particular, we use a multilevel Monte-Carlo approach due to Blanchet and Glynn to turn any optimal stochastic gradient method into an estimator of the minimizer with bias $\delta$, variance $O(\log(1/\delta))$, and an expected sampling cost of $O(\log(1/\delta))$ stochastic gradient evaluations. As an immediate consequence, we obtain cheap and nearly unbiased gradient estimators for the Moreau-Yoshida envelope of any Lipschitz convex function, allowing us to perform dimension-free randomized smoothing. We demonstrate the potential of our estimator through four applications. First, we develop a method for minimizing the maximum of $N$ functions, improving on recent results and matching a lower bound up to logarithmic factors. Second…
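The multilevel Monte Carlo construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sgd`, `mlmc_minimizer_estimate`), the `grad_oracle(x, rng)` interface, the `max_level` truncation parameter, and the choice of weighted-average SGD as the "optimal stochastic gradient method" are all hypothetical. The idea is to draw a random level $J$ with $P(J = j) = 2^{-j}$, run the base method for $2^J$ and $2^{J-1}$ steps, and return a reweighted difference, so that the expectation telescopes to that of the longest run while the expected number of gradient evaluations grows only linearly in `max_level`.

```python
import numpy as np


def sgd(grad_oracle, x_init, mu, num_steps, rng):
    """SGD for a mu-strongly-convex objective (stand-in base method).

    Uses step size 2 / (mu * (t + 1)) and returns the average of the
    iterates with weights proportional to t.
    """
    x = np.array(x_init, dtype=float)
    avg = np.zeros_like(x)
    weight_sum = 0.0
    for t in range(1, num_steps + 1):
        x = x - (2.0 / (mu * (t + 1))) * grad_oracle(x, rng)
        weight_sum += t
        avg += (t / weight_sum) * (x - avg)  # running weighted mean
    return avg


def mlmc_minimizer_estimate(grad_oracle, x_init, mu, max_level, rng):
    """One multilevel Monte Carlo sample of the minimizer (sketch).

    Returns (estimate, number_of_gradient_evaluations).
    """
    # numpy's geometric distribution has support {1, 2, ...},
    # so with p = 0.5 we get P(level = j) = 2**-j.
    level = int(rng.geometric(0.5))
    x0 = sgd(grad_oracle, x_init, mu, 1, rng)
    if level > max_level:
        # Truncation: levels above max_level contribute nothing.
        return x0, 1
    fine = sgd(grad_oracle, x_init, mu, 2 ** level, rng)
    coarse = sgd(grad_oracle, x_init, mu, 2 ** (level - 1), rng)
    cost = 1 + 2 ** level + 2 ** (level - 1)
    # Multiplying by 2**level = 1 / P(level) makes the expectation
    # telescope to the expectation of the 2**max_level-step run.
    return x0 + (2 ** level) * (fine - coarse), cost
```

In this sketch the fine and coarse runs are independent, and the expected cost is $1 + 1.5 \cdot \texttt{max\_level}$ gradient evaluations; increasing `max_level` lowers the bias at only logarithmically growing expected cost. A caller would typically average several independent samples to reduce variance.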
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
