Stochastic model-based minimization of weakly convex functions

Damek Davis; Dmitriy Drusvyatskiy

arXiv:1803.06523·math.OC·August 28, 2018

TL;DR

This paper gives a unified convergence analysis for stochastic algorithms that repeatedly sample and minimize simple models of a weakly convex objective. It derives rates and complexity guarantees for several classical methods by viewing them as approximate descent on an implicit smoothing of the problem, the Moreau envelope.

Contribution

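It provides the first complexity guarantees for the stochastic proximal point, proximal subgradient, and regularized Gauss-Newton methods for minimizing compositions of convex functions with smooth maps. The analysis interprets each method as approximate descent on the Moreau envelope of the problem.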

Findings

01

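Under reasonable conditions on the approximation quality and regularity of the models, every algorithm in the family drives a natural stationarity measure to zero at the rate $O(k^{-1/4})$.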

02

First complexity guarantees for the stochastic proximal point, proximal subgradient, and regularized Gauss-Newton methods on compositions of convex functions with smooth maps.

03

Convergence rate for the stochastic projected gradient method, without batching, for minimizing a smooth function on a closed convex set.
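
The abstract ties the stationarity measure in finding 01 to the Moreau envelope. The standard definitions can be sketched as follows. The symbols $\varphi$, $\rho$, $\lambda$, and $\hat{x}$ are notation assumed here for illustration and do not appear on this page.

```latex
% Assumed setting: \varphi is \rho-weakly convex, i.e.
% \varphi + (\rho/2)\|\cdot\|^2 is convex, and 0 < \lambda < 1/\rho.
\begin{align*}
\varphi_\lambda(x) &= \min_{y} \Big\{ \varphi(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \Big\}, \\
\hat{x} &= \operatorname*{argmin}_{y} \Big\{ \varphi(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \Big\}, \\
\nabla \varphi_\lambda(x) &= \lambda^{-1}\,(x - \hat{x}), \\
\|\hat{x} - x\| &= \lambda\,\|\nabla \varphi_\lambda(x)\|, \qquad
\varphi(\hat{x}) \le \varphi(x), \qquad
\operatorname{dist}\big(0;\, \partial \varphi(\hat{x})\big) \le \|\nabla \varphi_\lambda(x)\|.
\end{align*}
```

Read this way, a small value of $\|\nabla \varphi_\lambda(x)\|$ means that $x$ lies close to a point $\hat{x}$ that is itself nearly stationary for $\varphi$. A rate of $O(k^{-1/4})$ on this norm corresponds to $O(k^{-1/2})$ on its square.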

Abstract

We consider a family of algorithms that successively sample and minimize simple stochastic models of the objective function. We show that under reasonable conditions on approximation quality and regularity of the models, any such algorithm drives a natural stationarity measure to zero at the rate $O(k^{-1/4})$. As a consequence, we obtain the first complexity guarantees for the stochastic proximal point, proximal subgradient, and regularized Gauss-Newton methods for minimizing compositions of convex functions with smooth maps. The guiding principle, underlying the complexity guarantees, is that all algorithms under consideration can be interpreted as approximate descent methods on an implicit smoothing of the problem, given by the Moreau envelope. Specializing to classical circumstances, we obtain the long-sought convergence rate of the stochastic projected gradient method, without…
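
To make the "sample and minimize a simple stochastic model" idea concrete, here is a minimal Python sketch of one member of the family: a stochastic projected subgradient method, where the model is a sampled linearization plus a quadratic penalty. The test problem (a robust phase retrieval loss), the function names, the ball constraint, the constant parameter `beta`, and the uniform choice of the returned iterate are all assumptions made for illustration. None of them are taken from the paper's text as shown on this page.

```python
import numpy as np


def robust_phase_loss(x, A, b):
    """Objective (1/m) * sum_i |<a_i, x>^2 - b_i|: a convex function (abs)
    composed with a smooth map, hence weakly convex."""
    return np.mean(np.abs((A @ x) ** 2 - b))


def project_ball(x, radius):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def stochastic_prox_subgradient(A, b, x0, steps, beta, radius, seed=0):
    """Hypothetical helper: at each step, sample one term, build its linear
    model at the current point, and minimize
        model(y) + (beta / 2) * ||y - x||^2   over the ball,
    which reduces to a projected subgradient step of length 1 / beta."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    iterates = [x.copy()]
    for _ in range(steps):
        i = rng.integers(A.shape[0])           # sample one data point
        a = A[i]
        r = a @ x
        # Subgradient of y -> |<a, y>^2 - b_i| at the current point x.
        g = np.sign(r * r - b[i]) * 2.0 * r * a
        x = project_ball(x - g / beta, radius)
        iterates.append(x.copy())
    # Assumption: report a uniformly sampled iterate alongside the last one.
    x_rand = iterates[rng.integers(len(iterates))]
    return x_rand, x


# Small synthetic instance (all sizes and constants are arbitrary choices).
rng = np.random.default_rng(1)
d, m, steps, radius = 5, 200, 2000, 10.0
x_true = rng.standard_normal(d)
A = rng.standard_normal((m, d))
b = (A @ x_true) ** 2
x0 = project_ball(rng.standard_normal(d), radius)
beta = 10.0 * np.sqrt(steps)                   # constant, scaled with sqrt(steps)

x_rand, x_last = stochastic_prox_subgradient(A, b, x0, steps, beta, radius)
print("initial loss:", robust_phase_loss(x0, A, b))
print("final loss:  ", robust_phase_loss(x_last, A, b))
```

Swapping the linearization for a richer model changes only the update line: keeping the sampled function itself would give a proximal point step, and linearizing only the inner smooth map would give a Gauss-Newton style step. Each of those requires solving a small subproblem that this sketch does not implement.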
