Exact and Inexact Subsampled Newton Methods for Optimization
Raghu Bollapragada, Richard Byrd, Jorge Nocedal

TL;DR
This paper investigates subsampled Newton methods for stochastic optimization, analyzing their convergence, complexity, and practical performance in machine learning tasks like logistic regression.
Contribution
It introduces a superlinear convergence analysis (in expectation) for Newton-like methods with subsampled gradients and Hessians, and analyzes an inexact Newton method that samples only the Hessian and solves the Newton system approximately with the conjugate gradient (CG) method.
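For orientation, a Newton-like iteration with subsampled derivatives can be written as below. The notation is an assumption made for illustration and may differ from the paper's: F_i is the loss on data point i, X_k and S_k are the gradient and Hessian samples at iteration k, and α_k is a step length.

```latex
\nabla F_{X_k}(w) = \frac{1}{|X_k|} \sum_{i \in X_k} \nabla F_i(w),
\qquad
\nabla^2 F_{S_k}(w) = \frac{1}{|S_k|} \sum_{i \in S_k} \nabla^2 F_i(w),
\qquad
w_{k+1} = w_k - \alpha_k \left[ \nabla^2 F_{S_k}(w_k) \right]^{-1} \nabla F_{X_k}(w_k).
```

Coordinating the accuracy of the two approximations then amounts to choosing how the sample sizes |X_k| and |S_k| grow with k.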
Findings
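Superlinear convergence in expectation is attainable when the accuracy of the subsampled gradient and of the subsampled Hessian is coordinated.
A complexity analysis is given for an inexact Newton method that samples the Hessian and solves the Newton system with CG; the method is compared with a variant that uses a stochastic gradient iteration in place of CG.
Preliminary numerical results show promising performance on logistic regression tasks.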
Abstract
The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how to coordinate the accuracy in the gradient and Hessian to yield a superlinear rate of convergence in expectation. The second part of the paper analyzes an inexact Newton method that solves linear systems approximately using the conjugate gradient (CG) method, and that samples the Hessian and not the gradient (the gradient is assumed to be exact). We provide a complexity analysis for this method based on the properties of the CG iteration and the quality of the Hessian approximation, and compare it with a method that employs a stochastic gradient iteration instead of the CG method. We report preliminary numerical results that illustrate the performance…
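The inexact method described in the second part of the abstract (exact gradient, subsampled Hessian, approximate CG solve) can be sketched roughly as follows for logistic regression. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the L2 regularization term `lam`, the fixed Hessian sample size, the CG tolerance, and the backtracking line search are all choices made here for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y, lam):
    # L2-regularized logistic loss with labels y in {0, 1} (regularizer is an assumption).
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z) + 0.5 * lam * (w @ w)

def full_gradient(w, X, y, lam):
    # Exact gradient over the whole data set (the gradient is not sampled).
    return X.T @ (sigmoid(X @ w) - y) / X.shape[0] + lam * w

def subsampled_hessian_vec(w, X_s, lam):
    # Returns v -> H_S v for the Hessian built from the sampled rows X_s only.
    p = sigmoid(X_s @ w)
    d = p * (1.0 - p)
    return lambda v: X_s.T @ (d * (X_s @ v)) / X_s.shape[0] + lam * v

def cg(hess_vec, b, max_iter, tol):
    # Standard conjugate gradient for H x = b, started from x = 0 and
    # stopped early on a relative residual test (hence "inexact" Newton).
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    b_norm = np.sqrt(rs)
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Hp = hess_vec(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def subsampled_newton_cg(X, y, lam=1e-3, sample_size=200, cg_iters=10,
                         cg_tol=1e-2, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    w = np.zeros(dim)
    for _ in range(n_iters):
        g = full_gradient(w, X, y, lam)
        idx = rng.choice(n, size=min(sample_size, n), replace=False)
        hess_vec = subsampled_hessian_vec(w, X[idx], lam)
        step = cg(hess_vec, g, cg_iters, cg_tol)  # approximately H_S^{-1} g
        # Backtracking (Armijo) line search; an addition for robustness in this sketch.
        t, f0 = 1.0, loss(w, X, y, lam)
        while loss(w - t * step, X, y, lam) > f0 - 1e-4 * t * (g @ step) and t > 1e-8:
            t *= 0.5
        w = w - t * step
    return w
```

Only Hessian-vector products on the sampled rows are needed, so the sampled Hessian is never formed explicitly; the cost of each CG iteration scales with the Hessian sample size rather than with the full data set.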
