Complexity reduction in online stochastic Newton methods with potential O(N d) total cost
Antoine Godichon-Baggioni (LPSM (UMR_8001)), Bruno Portier (LMI), Guillaume Sallé (LMI, LPSM (UMR_8001))

TL;DR
This paper proposes an online stochastic Newton method with a random masking strategy that reduces computational cost to O(N d), enabling efficient second-order optimization in high-dimensional stochastic convex problems.
Contribution
It introduces a novel mini-batch stochastic Newton algorithm with a random Hessian column sampling technique, achieving low total computational cost while maintaining convergence guarantees.
Findings
Achieves O(N d) total cost for a single data pass
Converges almost surely and is asymptotically efficient
Does not require iterate averaging for convergence
Abstract
Optimizing smooth convex functions in stochastic settings, where only noisy estimates of gradients and Hessians are available, is a fundamental problem in optimization. While first-order methods possess a low per-iteration cost, their convergence is slow for ill-conditioned problems. Stochastic Newton methods utilize second-order information to correct for local curvature, but the O(d^3) per-iteration cost of computing and inverting a full Hessian, where d is the problem dimension, is prohibitive in high dimensions. This paper introduces an online mini-batch stochastic Newton algorithm. The method employs a random masking strategy that selects a subset of Hessian columns at each iteration, substantially reducing the per-step computational cost. This approach allows the algorithm, in the mini-batch setting, to achieve a total computational cost for a single pass over N data points of…
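To make the column-masking idea concrete, here is a minimal sketch in NumPy. It is not the paper's algorithm. It only illustrates, for a least-squares mini-batch, how computing k randomly chosen columns of the Hessian estimate costs O(b d k) instead of O(b d^2), and how rescaling by d / k keeps the masked estimate unbiased under uniform sampling. The function names, the least-squares setting, and the values of b, d, and k are assumptions made for illustration.

```python
import numpy as np

def full_hessian(X):
    """Mini-batch Hessian estimate for least squares: X^T X / b, cost O(b d^2)."""
    b = X.shape[0]
    return X.T @ X / b

def masked_hessian(X, cols):
    """Compute only the selected columns of X^T X / b, cost O(b d k).

    The selected columns are rescaled by d / k so that, when `cols` is drawn
    uniformly without replacement, the expectation equals the full estimate.
    Unselected columns are left at zero. (Hypothetical helper, for illustration.)
    """
    b, d = X.shape
    cols = np.asarray(cols)
    k = cols.size
    H = np.zeros((d, d))
    H[:, cols] = (d / k) * (X.T @ X[:, cols]) / b
    return H

# Illustrative sizes: batch of 32 points in dimension 10, 3 columns sampled.
rng = np.random.default_rng(0)
b, d, k = 32, 10, 3
X = rng.standard_normal((b, d))
cols = rng.choice(d, size=k, replace=False)
H_masked = masked_hessian(X, cols)
```

Each column is selected with probability k / d, which is why the d / k rescaling makes the masked matrix match the full estimate on average. How the paper uses the sampled columns inside its Newton update is not covered by this sketch.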
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference
