Stochastic Bias-Reduced Gradient Methods

Hilal Asi; Yair Carmon; Arun Jambulapati; Yujia Jin; Aaron Sidford

arXiv:2106.09481 · math.OC · October 29, 2021

TL;DR

This paper uses multilevel Monte Carlo to build a low-bias, low-cost estimator of the minimizer of any Lipschitz strongly-convex function, which in turn yields cheap, nearly unbiased gradient estimators for dimension-free randomized smoothing.

Contribution

It develops a novel estimator for the minimizer of Lipschitz strongly-convex functions, improving stochastic optimization efficiency and enabling dimension-free smoothing.

Findings

01

Estimates the minimizer with bias $\delta$, variance $O(\log(1/\delta))$, and an expected cost of $O(\log(1/\delta))$ stochastic gradient evaluations.

02

Improves optimization of the maximum of $N$ functions, matching lower bounds up to logarithmic factors.

03

Enables nearly linear-time, differentially-private non-smooth stochastic optimization.
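
Finding 01 rests on a multilevel Monte Carlo construction. The following is a minimal Python sketch of that idea, not the authors' implementation. Plain SGD with step size $1/(\mu t)$ stands in for the optimal stochastic gradient method, the iterates after $1$, $2^{J-1}$, and $2^J$ steps are read off a single run, and the names `sgd_checkpoints`, `mlmc_minimizer_estimate`, `noisy_grad`, and the truncation parameter `t_max` are hypothetical. The test problem (a noisy quadratic with known minimizer `b`) is also an assumption, chosen only so the result can be checked.

```python
import numpy as np

def sgd_checkpoints(grad_oracle, x_init, mu, checkpoints, rng):
    """Run SGD with step size 1/(mu*t) and return the iterate after each
    step count listed in `checkpoints` (a set of positive ints)."""
    x = np.array(x_init, dtype=float)
    out = {}
    for t in range(1, max(checkpoints) + 1):
        x = x - grad_oracle(x, rng) / (mu * t)
        if t in checkpoints:
            out[t] = x.copy()
    return out

def mlmc_minimizer_estimate(grad_oracle, x_init, mu, t_max, rng):
    """One draw of a multilevel Monte Carlo estimate of the minimizer.

    Level J is drawn with P(J = j) = 2**-j. If 2**J <= t_max, the draw is
    x_0 + 2**J * (x_J - x_{J-1}), where x_j is the SGD iterate after 2**j
    steps; otherwise it is x_0 alone. In expectation the differences
    telescope to the iterate at the largest level with 2**j <= t_max.
    """
    j = int(rng.geometric(0.5))  # support {1, 2, ...}, P(j) = 2**-j
    if 2 ** j > t_max:
        return sgd_checkpoints(grad_oracle, x_init, mu, {1}, rng)[1]
    its = sgd_checkpoints(grad_oracle, x_init, mu, {1, 2 ** (j - 1), 2 ** j}, rng)
    return its[1] + 2 ** j * (its[2 ** j] - its[2 ** (j - 1)])

# Assumed toy problem: F(x) = (mu/2) * ||x - b||^2 with Gaussian gradient noise.
rng = np.random.default_rng(0)
b = np.array([1.0, -2.0])
mu = 1.0

def noisy_grad(x, rng):
    return mu * (x - b) + rng.normal(size=x.shape)

estimates = np.array([
    mlmc_minimizer_estimate(noisy_grad, np.zeros(2), mu, 64, rng)
    for _ in range(4000)
])
mean_estimate = estimates.mean(axis=0)
print(mean_estimate)
```

Averaging many draws should land near `b`, while each draw costs only about $\log_2$(`t_max`) gradient evaluations in expectation, since level $j$ is chosen with probability $2^{-j}$ and costs $2^j$ steps.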

Abstract

We develop a new primitive for stochastic optimization: a low-bias, low-cost estimator of the minimizer $x_\star$ of any Lipschitz strongly-convex function. In particular, we use a multilevel Monte-Carlo approach due to Blanchet and Glynn to turn any optimal stochastic gradient method into an estimator of $x_\star$ with bias $\delta$, variance $O(\log(1/\delta))$, and an expected sampling cost of $O(\log(1/\delta))$ stochastic gradient evaluations. As an immediate consequence, we obtain cheap and nearly unbiased gradient estimators for the Moreau-Yoshida envelope of any Lipschitz convex function, allowing us to perform dimension-free randomized smoothing. We demonstrate the potential of our estimator through four applications. First, we develop a method for minimizing the maximum of $N$ functions, improving on recent results and matching a lower bound up to logarithmic factors. Second…
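
The link between a minimizer estimator and an envelope gradient can be checked on a small example. The sketch below assumes the convention $f_\lambda(y) = \min_x \{ f(x) + \tfrac{\lambda}{2}\|x - y\|^2 \}$, under which $\nabla f_\lambda(y) = \lambda (y - x_y)$ with $x_y$ the minimizer of the strongly convex inner problem (the scaling used in the paper may differ). It takes $f(x) = |x|$, where $x_y$ has a closed form, so no stochastic estimator is involved; the function names are hypothetical.

```python
import numpy as np

def prox_abs(y, lam):
    """Minimizer of |x| + (lam/2) * (x - y)**2: soft-thresholding at 1/lam."""
    return np.sign(y) * np.maximum(np.abs(y) - 1.0 / lam, 0.0)

def moreau_envelope_abs(y, lam):
    """Value of the envelope of |x| at y, evaluated at the inner minimizer."""
    x = prox_abs(y, lam)
    return np.abs(x) + 0.5 * lam * (x - y) ** 2

def moreau_grad_abs(y, lam):
    """Envelope gradient lam * (y - x_y), computed from the minimizer alone."""
    return lam * (y - prox_abs(y, lam))

lam = 2.0
ys = np.array([-2.0, 0.3, 2.0])
grads = moreau_grad_abs(ys, lam)

# Central finite differences of the envelope, used to check the gradient formula.
h = 1e-6
fd = (moreau_envelope_abs(ys + h, lam) - moreau_envelope_abs(ys - h, lam)) / (2 * h)
print(grads, fd)
```

Because the gradient depends on $y$ only through $x_y$, replacing the exact minimizer with a low-bias estimate of it gives a low-bias estimate of the envelope gradient, which is the consequence the abstract describes.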

Videos

Stochastic Bias-Reduced Gradient Methods · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data