Faster Perturbed Stochastic Gradient Methods for Finding Local Minima
Zixiang Chen, Dongruo Zhou, Quanquan Gu

TL;DR
This paper introduces LENA, a perturbed stochastic gradient framework that finds local minima in nonconvex optimization with lower stochastic gradient complexity by using a step-size shrinkage scheme.
Contribution
LENA is a novel framework that improves the stochastic gradient complexity of finding local minima. It relies on step-size shrinkage and is compatible with several stochastic gradient estimators.
Findings
LENA achieves $\tilde O(\epsilon^{-3} + \epsilon_H^{-6})$ stochastic gradient complexity.
LENA improves on the previous best complexity for perturbed stochastic gradient methods, $\tilde O(\epsilon^{-3.5})$.
The step-size shrinkage scheme is key to faster convergence.
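To see how the two terms of the bound interact, the following worked arithmetic assumes the common coupling $\epsilon_H = \sqrt{\epsilon}$ and ignores constants and logarithmic factors. The sample value $\epsilon = 10^{-4}$ is chosen here purely for illustration.

```latex
\begin{align*}
\epsilon_H = \sqrt{\epsilon}
  &\;\Longrightarrow\; \epsilon_H^{-6} = \bigl(\epsilon^{1/2}\bigr)^{-6} = \epsilon^{-3}, \\
\tilde O\bigl(\epsilon^{-3} + \epsilon_H^{-6}\bigr)
  &= \tilde O\bigl(\epsilon^{-3} + \epsilon^{-3}\bigr) = \tilde O\bigl(\epsilon^{-3}\bigr), \\
\frac{\epsilon^{-3.5}}{\epsilon^{-3}}
  &= \epsilon^{-1/2}
  \quad\text{e.g. } \epsilon = 10^{-4}:\;
  \frac{10^{14}}{10^{12}} = 10^{2}.
\end{align*}
```

Under that assumption the improvement over an $\epsilon^{-3.5}$ rate is a factor of $\epsilon^{-1/2}$, which grows as the target accuracy tightens.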
Abstract
Escaping from saddle points and finding local minima is a central problem in nonconvex optimization. Perturbed gradient methods are perhaps the simplest approach for this problem. However, to find $(\epsilon, \epsilon_H)$-approximate local minima, the existing best stochastic gradient complexity for this type of algorithm is $\tilde O(\epsilon^{-3.5})$, which is not optimal. In this paper, we propose LENA (Last stEp shriNkAge), a faster perturbed stochastic gradient framework for finding local minima. We show that LENA with stochastic gradient estimators such as SARAH/SPIDER and STORM can find $(\epsilon, \epsilon_H)$-approximate local minima within $\tilde O(\epsilon^{-3} + \epsilon_H^{-6})$ stochastic gradient evaluations (or $\tilde O(\epsilon^{-3})$ when $\epsilon_H = \sqrt{\epsilon}$). The core idea of our framework is a step-size shrinkage scheme to control the average…
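For readers unfamiliar with perturbed gradient methods, here is a minimal sketch of the generic idea on a toy problem. It is not LENA: it uses exact gradients, and it implements neither the step-size shrinkage scheme nor the SARAH/SPIDER or STORM estimators. The objective, the function names `grad` and `perturbed_gd`, and all parameter values are assumptions made up for illustration.

```python
import numpy as np

def grad(p):
    # Gradient of the toy objective f(x, y) = 0.5*x**2 + 0.25*y**4 - 0.5*y**2.
    # It has a strict saddle at (0, 0) and local minima at (0, 1) and (0, -1).
    x, y = p
    return np.array([x, y**3 - y])

def perturbed_gd(p0, step=0.1, radius=1e-2, g_thresh=1e-3,
                 wait=50, iters=500, seed=0):
    """Gradient descent that adds a random kick when the gradient is small.

    A kick of norm `radius` is applied at most once every `wait` iterations,
    so the iterate has time to move away from a saddle point before the
    small-gradient test fires again.
    """
    rng = np.random.default_rng(seed)
    p = np.array(p0, dtype=float)
    last_kick = -wait
    for t in range(iters):
        g = grad(p)
        if np.linalg.norm(g) <= g_thresh and t - last_kick >= wait:
            # Random direction, rescaled to have norm `radius`.
            d = rng.normal(size=p.shape)
            p = p + radius * d / np.linalg.norm(d)
            last_kick = t
            g = grad(p)
        p = p - step * g
    return p

if __name__ == "__main__":
    # Started exactly at the saddle, unperturbed descent (radius=0) never moves,
    # while the perturbed variant drifts to one of the two local minima.
    print("no perturbation:", perturbed_gd([0.0, 0.0], radius=0.0))
    print("with perturbation:", perturbed_gd([0.0, 0.0]))
```

The kick is what lets the iterate leave the saddle: at (0, 0) the gradient is exactly zero, so plain descent stays put, while any small component along the negative-curvature direction grows geometrically under the update.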
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs
