Stochastic Cubic Regularization for Fast Nonconvex Optimization
Nilesh Tripuraneni, Mitchell Stern, Chi Jin, Jeffrey Regier, Michael I. Jordan

TL;DR
This paper introduces a stochastic cubic-regularized Newton method that efficiently escapes saddle points and finds local minima in nonconvex optimization with fewer evaluations than traditional stochastic gradient descent.
Contribution
It presents a stochastic variant of the cubic-regularized Newton method that converges faster than stochastic gradient descent without relying on delicate acceleration or variance-reduction techniques.
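For reference, the classical cubic-regularized Newton step of Nesterov and Polyak, which this method builds on, can be written as follows. The notation is an assumption introduced here rather than taken from the summary: $f$ is the objective, $x_t$ the current iterate, and $\rho$ a bound on the Lipschitz constant of the Hessian.

$$
x_{t+1} = \operatorname*{argmin}_{x} \Big\{ f(x_t) + \nabla f(x_t)^\top (x - x_t) + \tfrac{1}{2} (x - x_t)^\top \nabla^2 f(x_t) (x - x_t) + \tfrac{\rho}{6} \lVert x - x_t \rVert^3 \Big\}
$$

A stochastic variant would presumably replace $\nabla f(x_t)$ and $\nabla^2 f(x_t)$ with minibatch estimates, so the exact form used in the paper may differ in its details.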
Findings
Achieves $\tilde{O}(\epsilon^{-3.5})$ complexity for finding approximate local minima.
Requires only stochastic gradient and stochastic Hessian-vector product evaluations; the latter can be computed as efficiently as stochastic gradients.
Improves upon the $\tilde{O}(\epsilon^{-4})$ rate of stochastic gradient descent.
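To give a feel for the gap between the two rates, the short calculation below compares the leading-order terms for one target accuracy. It is only illustrative arithmetic: the helper `oracle_calls` and the choice `eps = 1e-4` are assumptions made here, and the constants and logarithmic factors hidden by the $\tilde{O}$ notation are ignored.

```python
def oracle_calls(eps, exponent):
    # Leading-order count eps**(-exponent); constants and log factors are ignored.
    return eps ** (-exponent)

eps = 1e-4                               # hypothetical target accuracy
sgd_calls = oracle_calls(eps, 4.0)       # rate quoted for stochastic gradient descent
cubic_calls = oracle_calls(eps, 3.5)     # rate quoted for the stochastic cubic method
speedup = sgd_calls / cubic_calls        # equals eps**(-0.5) up to rounding
print(f"leading-order ratio at eps={eps}: {speedup:.1f}")
```

The ratio scales as $\epsilon^{-0.5}$, so the advantage grows as the target accuracy tightens.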
Abstract
This paper proposes a stochastic variant of a classic algorithm, the cubic-regularized Newton method [Nesterov and Polyak 2006]. The proposed algorithm efficiently escapes saddle points and finds approximate local minima for general smooth, nonconvex functions in only $\tilde{O}(\epsilon^{-3.5})$ stochastic gradient and stochastic Hessian-vector product evaluations. The latter can be computed as efficiently as stochastic gradients. This improves upon the $\tilde{O}(\epsilon^{-4})$ rate of stochastic gradient descent. Our rate matches the best-known result for finding local minima without requiring any delicate acceleration or variance-reduction techniques.
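The following is a minimal sketch, not the paper's implementation, of how a cubic-regularized step can leave a saddle point using only gradients and Hessian-vector products. Everything in it is an assumption chosen for illustration: the toy objective `f`, the finite-difference `hvp`, the value `rho = 4.0`, and the use of plain gradient descent on the cubic submodel. Exact gradients stand in for the minibatch estimates a stochastic method would use.

```python
import numpy as np

def f(x):
    # Toy objective with a saddle point at the origin and minima at (0, +-1).
    return 0.5 * x[0] ** 2 - 0.5 * x[1] ** 2 + 0.25 * x[1] ** 4

def grad(x):
    # Exact gradient of f; a minibatch estimate would take its place in the
    # stochastic setting.
    return np.array([x[0], -x[1] + x[1] ** 3])

def hvp(x, v, r=1e-5):
    # Hessian-vector product from two gradient calls (central difference),
    # so the Hessian matrix is never formed.
    return (grad(x + r * v) - grad(x - r * v)) / (2.0 * r)

def cubic_step(x, rho=4.0, lr=0.1, iters=500):
    # Approximately minimize m(d) = g.d + 0.5 d.H d + (rho / 6) ||d||^3
    # by gradient descent on d, touching H only through hvp().
    g = grad(x)
    d = np.full_like(x, 1e-3)  # small nonzero start so negative curvature can grow
    for _ in range(iters):
        model_grad = g + hvp(x, d) + 0.5 * rho * np.linalg.norm(d) * d
        d = d - lr * model_grad
    return d

x0 = np.zeros(2)       # the saddle point, where the gradient vanishes
d = cubic_step(x0)     # step found from the cubic submodel
x1 = x0 + d
print("step:", d, "f before:", f(x0), "f after:", f(x1))
```

At the origin the gradient is zero, so a pure gradient step would not move. The cubic submodel still produces a nonzero step along the negative-curvature direction, and the cubic term keeps that step bounded.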
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs
