Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien

TL;DR
This paper analyzes stochastic second-order methods for over-parameterized models that satisfy an interpolation condition, proving linear and local quadratic convergence rates and validating the methods empirically on binary classification tasks.
Contribution
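It analyzes the regularized subsampled Newton method (R-SSN), proving global linear convergence with an adaptive step-size and a constant batch size, and local quadratic convergence when the batch size grows. It also proves global linear convergence for stochastic BFGS methods in the interpolation setting.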
Findings
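R-SSN achieves global linear convergence with an adaptive step-size and a constant batch size.
Growing the batch size for both the subsampled gradient and Hessian yields quadratic convergence in a local neighbourhood of the solution.
Stochastic BFGS attains global linear convergence in the interpolation setting.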
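For orientation, the two ingredients behind these findings can be written out as a sketch. The notation is assumed here rather than taken from this page: f is the average of n component losses f_i, w^* is its minimizer, G_k and S_k are the mini-batches used for the gradient and the Hessian at iteration k, \tau > 0 is the regularization parameter, and \eta_k is the step-size.

```latex
% Interpolation: every component loss is stationary at the minimizer.
f(w) = \frac{1}{n} \sum_{i=1}^{n} f_i(w), \qquad
\nabla f(w^*) = 0 \;\Longrightarrow\; \nabla f_i(w^*) = 0 \quad \text{for all } i.

% One regularized subsampled Newton step (assumed notation).
w_{k+1} = w_k - \eta_k \Big[ \tfrac{1}{|S_k|} \sum_{i \in S_k} \nabla^2 f_i(w_k) + \tau I \Big]^{-1}
          \tfrac{1}{|G_k|} \sum_{i \in G_k} \nabla f_i(w_k).
```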
Abstract
We consider stochastic second-order methods for minimizing smooth and strongly-convex functions under an interpolation condition satisfied by over-parameterized models. Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size. By growing the batch size for both the subsampled gradient and Hessian, we show that R-SSN can converge at a quadratic rate in a local neighbourhood of the solution. We also show that R-SSN attains local linear convergence for the family of self-concordant functions. Furthermore, we analyze stochastic BFGS algorithms in the interpolation setting and prove their global linear convergence. We empirically evaluate stochastic L-BFGS and a "Hessian-free" implementation of R-SSN for binary classification on synthetic, linearly-separable datasets and real…
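To make the regularized subsampled Newton step concrete, here is a minimal Python sketch. It is not the authors' implementation: it uses a realizable least-squares problem (so interpolation holds by construction) instead of the paper's classification experiments, forms the mini-batch Hessian explicitly rather than "Hessian-free", and picks the step-size with a plain Armijo backtracking search on the mini-batch loss. The function name `r_ssn_least_squares` and the values of `tau`, `batch_size`, and the Armijo constant are illustrative assumptions.

```python
import numpy as np

def r_ssn_least_squares(X, y, batch_size=32, tau=1e-3, iters=50, seed=0):
    """Sketch of regularized subsampled Newton on f_i(w) = 0.5 * (x_i @ w - y_i)**2.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        # Constant batch size; the same batch is used for gradient and Hessian.
        idx = rng.choice(n, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        residual = Xb @ w - yb
        loss = 0.5 * np.mean(residual ** 2)
        grad = Xb.T @ residual / batch_size
        # Subsampled Hessian plus tau * I keeps the linear system well-posed.
        hess = Xb.T @ Xb / batch_size + tau * np.eye(d)
        direction = np.linalg.solve(hess, grad)
        slope = grad @ direction  # non-negative because hess is positive definite
        # Armijo backtracking on the mini-batch loss (constant 0.25 is an assumption).
        eta = 1.0
        while eta > 1e-8:
            new_residual = Xb @ (w - eta * direction) - yb
            if 0.5 * np.mean(new_residual ** 2) <= loss - 0.25 * eta * slope:
                break
            eta *= 0.5
        w = w - eta * direction
    return w

# Realizable data: y is an exact linear function of X, so every f_i is
# minimized at w_star and the interpolation condition holds.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((200, 10))
w_star = data_rng.standard_normal(10)
y = X @ w_star

w_hat = r_ssn_least_squares(X, y)
print("distance to w_star:", np.linalg.norm(w_hat - w_star))
```

Because every mini-batch loss shares the minimizer `w_star` in this toy problem, the mini-batch steps never pull the iterate away from the solution, which is the intuition behind convergence with a constant batch size under interpolation.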
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research
