Multi-level Monte-Carlo Gradient Methods for Stochastic Optimization with Biased Oracles

Yifan Hu; Jie Wang; Xin Chen; Niao He

arXiv:2408.11084 · math.OC · August 22, 2024

TL;DR

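This paper studies multi-level Monte Carlo (MLMC) gradient methods for stochastic optimization with biased oracles, where low-bias stochastic gradients are expensive to obtain. It shows that these methods achieve lower total sample and computational complexity than the widely used biased stochastic gradient method for strongly convex, convex, and nonconvex objectives.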

Contribution

The paper develops and analyzes MLMC gradient methods that effectively balance bias, variance, and cost, demonstrating their advantages over existing biased stochastic gradient approaches.

Findings

01. MLMC methods outperform standard biased stochastic gradient methods.

02. Combining MLMC with variance reduction techniques further reduces complexity.

03. Numerical experiments confirm superior performance in practical applications.

Abstract

We consider stochastic optimization when one only has access to biased stochastic oracles of the objective and the gradient, and obtaining stochastic gradients with low biases comes at high costs. This setting captures various optimization paradigms, such as conditional stochastic optimization, distributionally robust optimization, and shortfall risk optimization, as well as machine learning paradigms such as contrastive learning. We examine a family of multi-level Monte Carlo (MLMC) gradient methods that exploit a delicate tradeoff among bias, variance, and oracle cost. We systematically study their total sample and computational complexities for strongly convex, convex, and nonconvex objectives and demonstrate their superiority over the widely used biased stochastic gradient method. When combined with variance reduction techniques such as SPIDER, these MLMC gradient methods can further reduce…
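To make the bias, variance, and cost tradeoff concrete, the following is a minimal sketch of one randomized-level MLMC gradient estimator on a toy problem. Everything in it is an illustrative assumption rather than the paper's algorithm or experiments: the objective F(x) = (E[x + eta])^3 with standard normal eta, the function names `biased_grad`, `level_difference`, and `rt_mlmc_grad`, and the level probabilities proportional to 2^(-l). At level l the plug-in gradient uses 2^l inner samples and has bias 3 / 2^l, so lower bias costs more samples.

```python
import numpy as np


def biased_grad(x, eta):
    """Plug-in gradient of the toy objective F(x) = (E[x + eta])**3.

    With standard normal inner samples the bias is 3 / len(eta),
    because the noisy inner mean is squared.
    """
    return 3.0 * (x + eta.mean()) ** 2


def level_difference(x, level, rng):
    """Coupled difference between the level-l and level-(l-1) estimators.

    Both estimators reuse the same 2**level inner samples: the fine one
    uses all of them, the coarse one averages the two half-batch
    estimators. Level 0 returns the one-sample estimator itself.
    """
    eta = rng.standard_normal(2 ** level)
    if level == 0:
        return biased_grad(x, eta)
    half = eta.size // 2
    coarse = 0.5 * (biased_grad(x, eta[:half]) + biased_grad(x, eta[half:]))
    return biased_grad(x, eta) - coarse


def rt_mlmc_grad(x, max_level, rng):
    """Draw one random level and reweight its difference by 1 / p_level.

    By the telescoping sum, the expectation equals that of the plug-in
    estimator at max_level (assumed level distribution: p_l ~ 2**-l).
    """
    levels = np.arange(max_level + 1)
    probs = 2.0 ** (-levels)
    probs /= probs.sum()
    level = int(rng.choice(levels, p=probs))
    return level_difference(x, level, rng) / probs[level]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    draws = [rt_mlmc_grad(1.0, 5, rng) for _ in range(20000)]
    # Compare with the level-5 plug-in expectation 3 * 1**2 + 3 / 32.
    print("MLMC average:", np.mean(draws))
```

In this sketch each draw has the same expectation as the 32-sample plug-in estimator but uses roughly 3 inner samples on average, since low levels are sampled far more often than high ones. The price is a larger per-draw variance, which is the tradeoff the abstract refers to.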

Taxonomy

Topics: Stochastic Gradient Optimization Techniques

Methods: Contrastive Learning