Improving Stochastic Cubic Newton with Momentum
El Mahdi Chayti, Nikita Doikov, Martin Jaggi

TL;DR
This paper introduces a stochastic cubic Newton method with momentum. The momentum stabilizes the stochastic gradient and Hessian estimates, gives global convergence with as little as a single data sample per iteration, and improves speed on convex problems.
Contribution
The paper is the first to prove, using momentum, global convergence of the stochastic cubic Newton method with arbitrary batch sizes in non-convex optimization.
Findings
Momentum reduces the variance of the stochastic gradient and Hessian estimates.
Global convergence to a second-order stationary point is achieved with a single sample per iteration.
Speed is improved on convex stochastic problems.
Abstract
We study stochastic second-order methods for solving general non-convex optimization problems. We propose using a special version of momentum to stabilize the stochastic gradient and Hessian estimates in Newton's method. We show that momentum provably improves the variance of stochastic estimates and allows the method to converge for any noise level. Using the cubic regularization technique, we prove a global convergence rate for our method on general non-convex problems to a second-order stationary point, even when using only a single stochastic data sample per iteration. This starkly contrasts with all existing stochastic second-order methods for non-convex problems, which typically require large batches. Therefore, we are the first to demonstrate global convergence for batches of arbitrary size in the non-convex case for the Stochastic Cubic Newton. Additionally, we show improved…
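The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the "special version of momentum" is not specified in this summary, so a plain exponential moving average of the gradient and Hessian samples is assumed in its place. The toy objective, the noise model, and the names `cubic_newton_step`, `run`, `alpha`, `beta`, and `M` are all hypothetical. Each step minimizes the cubic model g·h + ½ h·Hh + (M/6)‖h‖³ built from the averaged estimates.

```python
import numpy as np

def cubic_newton_step(g, H, M, tol=1e-10, max_iter=200):
    """Minimize g@h + 0.5*h@H@h + (M/6)*||h||^3 over h.

    Uses the condition (H + (M/2) r I) h = -g with r = ||h|| and
    bisects on r. The degenerate "hard case" is not handled.
    """
    lam, Q = np.linalg.eigh(H)           # eigenvalues in ascending order
    g_rot = Q.T @ g

    def step_norm(r):
        return np.linalg.norm(g_rot / (lam + 0.5 * M * r))

    # r must keep H + (M/2) r I positive definite.
    r_lo = max(0.0, -2.0 * lam[0] / M) + 1e-12
    r_hi = max(2.0 * r_lo, 1.0)
    while step_norm(r_hi) > r_hi:
        r_hi *= 2.0
    for _ in range(max_iter):
        r = 0.5 * (r_lo + r_hi)
        if step_norm(r) > r:
            r_lo = r
        else:
            r_hi = r
        if r_hi - r_lo < tol:
            break
    return -Q @ (g_rot / (lam + 0.5 * M * r_hi))

# Hypothetical toy objective: f(x) = sum(x^4/4 - x^2/2), non-convex with a saddle at 0.
def grad(x):
    return x**3 - x

def hess(x):
    return np.diag(3.0 * x**2 - 1.0)

def run(x0, n_iters=100, alpha=0.1, beta=0.1, M=10.0, noise=0.1, seed=0):
    """One noisy gradient/Hessian sample per iteration, averaged with momentum."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    g_est = np.zeros_like(x)
    H_est = np.zeros((x.size, x.size))
    for k in range(n_iters):
        # Assumed noise model: additive Gaussian noise on exact derivatives.
        g_k = grad(x) + noise * rng.standard_normal(x.size)
        E = noise * rng.standard_normal((x.size, x.size))
        H_k = hess(x) + 0.5 * (E + E.T)  # keep the Hessian sample symmetric
        a = 1.0 if k == 0 else alpha      # first sample initializes the averages
        b = 1.0 if k == 0 else beta
        # Assumed momentum: exponential moving averages of the samples.
        g_est = (1.0 - a) * g_est + a * g_k
        H_est = (1.0 - b) * H_est + b * H_k
        x = x + cubic_newton_step(g_est, H_est, M)
    return x

if __name__ == "__main__":
    x_final = run([0.1, -0.2])
    print("final point:", x_final, "gradient norm:", np.linalg.norm(grad(x_final)))
```

Setting `alpha = beta = 1` and `noise = 0` reduces the loop to the deterministic cubic Newton method, which is a convenient sanity check. Smaller `alpha` and `beta` average over more past samples, which lowers the variance of the estimates at the cost of using derivatives evaluated at older iterates.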
Taxonomy
Topics: Numerical Methods and Algorithms
