Faster Perturbed Stochastic Gradient Methods for Finding Local Minima

Zixiang Chen; Dongruo Zhou; Quanquan Gu

arXiv:2110.13144·math.OC·April 21, 2022

TL;DR

This paper introduces LENA, a new perturbed stochastic gradient framework that accelerates finding local minima in nonconvex optimization by reducing stochastic gradient complexity using a step-size shrinkage scheme.

Contribution

LENA is a perturbed stochastic gradient framework that lowers the stochastic gradient complexity of finding local minima. It relies on a step-size shrinkage scheme and is compatible with several stochastic gradient estimators, including SARAH/SPIDER and STORM.

Findings

01

LENA achieves $\tilde{O}(ϵ^{-3} + ϵ_{H}^{-6})$ stochastic gradient complexity for finding $(ϵ, ϵ_{H})$-approximate local minima.

02

LENA improves on the previous best complexity for perturbed stochastic gradient methods, which was $\tilde{O}(ϵ^{-3.5})$.

03

The step-size shrinkage scheme is key to faster convergence.
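As a rough worked comparison of the two rates above, assume the common choice $ϵ_{H} = \sqrt{ϵ}$ and ignore constants and logarithmic factors (both are assumptions made here for illustration, not figures reported on this page):

$$
ϵ_{H}^{-6} = \left(ϵ^{1/2}\right)^{-6} = ϵ^{-3}
\quad\Longrightarrow\quad
ϵ^{-3} + ϵ_{H}^{-6} = 2\,ϵ^{-3} = \tilde{O}(ϵ^{-3}),
\qquad
\frac{ϵ^{-3.5}}{ϵ^{-3}} = ϵ^{-1/2}.
$$

For instance, at $ϵ = 10^{-4}$ the two rates scale like $10^{12}$ and $10^{14}$, a gap of about $100\times$ under these simplifications.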

Abstract

Escaping from saddle points and finding local minima is a central problem in nonconvex optimization. Perturbed gradient methods are perhaps the simplest approach to this problem. However, to find $(ϵ, \sqrt{ϵ})$-approximate local minima, the best existing stochastic gradient complexity for this type of algorithm is $\tilde{O}(ϵ^{-3.5})$, which is not optimal. In this paper, we propose LENA (Last stEp shriNkAge), a faster perturbed stochastic gradient framework for finding local minima. We show that LENA with stochastic gradient estimators such as SARAH/SPIDER and STORM can find $(ϵ, ϵ_{H})$-approximate local minima within $\tilde{O}(ϵ^{-3} + ϵ_{H}^{-6})$ stochastic gradient evaluations (or $\tilde{O}(ϵ^{-3})$ when $ϵ_{H} = \sqrt{ϵ}$). The core idea of our framework is a step-size shrinkage scheme to control the average…
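The abstract does not spell out what an approximate local minimum is. A minimal sketch follows, assuming the definition that is standard in this literature: a point is an $(ϵ, ϵ_{H})$-approximate local minimum when the gradient norm is at most $ϵ$ and the smallest Hessian eigenvalue is at least $-ϵ_{H}$. The function name `is_approx_local_min` and the two toy quadratics are hypothetical, chosen only for illustration; nothing here is taken from the paper's algorithm.

```python
import numpy as np


def is_approx_local_min(grad, hess, eps, eps_h):
    """Check the assumed standard condition:
    ||grad|| <= eps  and  lambda_min(hess) >= -eps_h.
    """
    grad_norm = np.linalg.norm(grad)
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order,
    # so the first entry is the smallest eigenvalue.
    min_eig = np.linalg.eigvalsh(hess)[0]
    return bool(grad_norm <= eps and min_eig >= -eps_h)


eps = 1e-2
eps_h = np.sqrt(eps)  # the eps_H = sqrt(eps) regime
origin = np.zeros(2)

# Toy saddle f(x, y) = x^2 - y^2: zero gradient at the origin,
# but the Hessian has a negative eigenvalue of -2.
saddle_grad = np.array([2.0 * origin[0], -2.0 * origin[1]])
saddle_hess = np.array([[2.0, 0.0], [0.0, -2.0]])

# Toy bowl f(x, y) = x^2 + y^2: zero gradient and positive curvature.
bowl_grad = np.array([2.0 * origin[0], 2.0 * origin[1]])
bowl_hess = np.array([[2.0, 0.0], [0.0, 2.0]])

saddle_ok = is_approx_local_min(saddle_grad, saddle_hess, eps, eps_h)
bowl_ok = is_approx_local_min(bowl_grad, bowl_hess, eps, eps_h)
```

The saddle passes the first-order test yet fails the curvature test. This is the case that perturbed gradient methods are designed to escape.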

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs