Stochastic subgradient method converges on tame functions

Damek Davis; Dmitriy Drusvyatskiy; Sham Kakade; Jason D. Lee

arXiv:1804.07795 · math.OC · May 29, 2018

TL;DR

This paper proves that every limit point of the stochastic subgradient method is first-order stationary for a broad class of nonsmooth, nonconvex functions, including the losses of popular deep learning architectures, by exploiting the structure of tame (semialgebraic or, more generally, Whitney stratifiable) functions.

Contribution

It establishes convergence guarantees for the stochastic subgradient method, and its proximal extension, on locally Lipschitz functions that are semialgebraic or, more generally, have a Whitney stratifiable graph. This class covers many problems arising in data science.

Findings

01. Every limit point of the iterates is first-order stationary.

02. The guarantee covers all popular deep learning architectures.

03. The guarantees are rigorous and hold without smoothness or convexity.
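
For reference, the update rule and the notion of stationarity behind these findings can be written out as follows. The notation is an assumption of this summary, not a quotation from the paper: \partial f is taken to be the Clarke subdifferential of f, \alpha_k > 0 a step size, and \xi_k a zero-mean noise term.

```latex
\[
  x_{k+1} = x_k - \alpha_k \,(y_k + \xi_k), \qquad y_k \in \partial f(x_k),
\]
\[
  x^\star \text{ is first-order stationary} \iff 0 \in \partial f(x^\star).
\]
```

When f is smooth, \partial f(x) reduces to the single gradient, so the condition becomes the familiar \nabla f(x^\star) = 0.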

Abstract

This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity? We prove that the stochastic subgradient method, on any semialgebraic locally Lipschitz function, produces limit points that are all first-order stationary. More generally, our result applies to any function with a Whitney stratifiable graph. In particular, this work endows the stochastic subgradient method, and its proximal extension, with rigorous convergence guarantees for a wide class of problems arising in data science, including all popular deep learning architectures.
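
To make the method in the abstract concrete, here is a minimal sketch of a stochastic subgradient loop on one nonsmooth, nonconvex, semialgebraic objective. Everything in it is an illustrative assumption rather than the authors' code: the objective (an average of |(a·x)^2 - b|, a robust phase-retrieval-style loss), the function names, and the step-size schedule c / (k + 1)^0.75, which is chosen only because it is not summable but is square-summable.

```python
import random


def robust_phase_loss(x, data):
    """Average of |(a.x)^2 - b| over the samples: nonsmooth and nonconvex."""
    total = 0.0
    for a, b in data:
        inner = sum(ai * xi for ai, xi in zip(a, x))
        total += abs(inner * inner - b)
    return total / len(data)


def sample_subgradient(x, a, b):
    """Return one Clarke subgradient of x -> |(a.x)^2 - b| at x."""
    inner = sum(ai * xi for ai, xi in zip(a, x))
    residual = inner * inner - b
    # sign(0) is taken to be 0, which is a valid choice from [-1, 1].
    sign = (residual > 0) - (residual < 0)
    return [sign * 2.0 * inner * ai for ai in a]


def stochastic_subgradient(x0, data, num_iters=2000, c=0.1, power=0.75, seed=0):
    """Run x <- x - alpha_k * g_k with g_k a subgradient at one random sample."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(num_iters):
        a, b = data[rng.randrange(len(data))]  # draw one sample uniformly
        g = sample_subgradient(x, a, b)
        alpha = c / (k + 1) ** power  # hypothetical diminishing step size
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x
```

In the one-dimensional case with the single sample a = 1, b = 1, the objective is |x^2 - 1|. Calling `stochastic_subgradient([2.0], [([1.0], 1.0)])` drives the iterate toward x = 1, a stationary point at which the function is not differentiable.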

Code & Models

Repositories

IBM/FormalML