Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter
Zeyuan Allen-Zhu

TL;DR
This paper introduces stochastic optimization methods tailored for nonconvex functions, with convergence rates that adapt based on the function's degree of nonconvexity, characterized by the Hessian's smallest eigenvalue.
Contribution
The author proposes new stochastic first-order algorithms whose convergence depends on the Hessian's smallest eigenvalue, outperforming known results over a range of nonconvexity levels.
Findings
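The methods outperform known results for a range of the nonconvexity parameter $\sigma$.
Convergence rates depend on the eigenvalue parameter $\sigma$, showing a dichotomy at a threshold $\sigma_0$.
The fastest known methods scale differently in the $\sigma > \sigma_0$ and $\sigma < \sigma_0$ regimes.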
Abstract
Given a nonconvex function $f(x)$ that is an average of $n$ smooth functions, we design stochastic first-order methods to find its approximate stationary points. The convergence of our new methods depends on the smallest (negative) eigenvalue $-\sigma$ of the Hessian, a parameter that describes how nonconvex the function is. Our methods outperform known results for a range of parameter $\sigma$, and can be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold $\sigma_0$ so that the currently fastest methods for $\sigma > \sigma_0$ and for $\sigma < \sigma_0$ have different behaviors: the former scales with $n^{2/3}$ and the latter scales with $n^{3/4}$.
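To make the nonconvexity parameter concrete, here is a minimal sketch (not the paper's algorithm) that measures it on a toy finite-sum objective. It assumes quadratic components, so the Hessian is constant, and writes sigma for the magnitude of the most negative Hessian eigenvalue. The function name `nonconvexity_parameter` and the matrices are hypothetical, chosen only for illustration.

```python
import numpy as np


def nonconvexity_parameter(hessian):
    """Return sigma = max(0, -lambda_min) for a symmetric Hessian matrix.

    Hypothetical helper for illustration; sigma = 0 means no negative curvature.
    """
    eigenvalues = np.linalg.eigvalsh(hessian)  # eigenvalues in ascending order
    return max(0.0, -float(eigenvalues[0]))


# Toy finite sum (assumed for illustration): f(x) = (1/n) * sum_i 0.5 * x^T A_i x,
# so the Hessian of f is the average of the component matrices A_i.
component_hessians = [
    np.diag([2.0, -1.0]),  # this component has negative curvature
    np.diag([4.0, 0.5]),   # this component is convex
]
hessian_f = np.mean(component_hessians, axis=0)  # equals diag([3.0, -0.25])

sigma = nonconvexity_parameter(hessian_f)
print(f"sigma = {sigma}")
```

For a general smooth function the Hessian varies with the point, so sigma would have to bound the negative curvature over the whole domain rather than at a single matrix; the quadratic case simply avoids that complication.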
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
