A Progressive Batching L-BFGS Method for Machine Learning
Raghu Bollapragada, Dheevatsa Mudigere, Jorge Nocedal, Hao-Jun Michael Shi, Ping Tak Peter Tang

TL;DR
This paper introduces a progressive batching L-BFGS algorithm that combines a gradually increasing sample size, a stochastic line search, and stable quasi-Newton updates for large-scale machine learning optimization.
Contribution
It proposes a novel progressive batching L-BFGS method with convergence guarantees, bridging the gap between full batch and stochastic approaches.
Findings
Performs well on logistic regression and neural networks
Offers convergence guarantees for the proposed method
Balances efficiency and stability in large-scale optimization
Abstract
The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components - progressive batching, a stochastic…
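The progressive batching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the geometric growth factor, the Armijo backtracking on the sampled loss, the choice to compute both gradients of a curvature pair on the same sample, and all names (`two_loop`, `progressive_lbfgs`, `batch0`, `growth`) are assumptions made for illustration, and the least-squares problem is a hypothetical stand-in for a real training objective.

```python
import numpy as np

def two_loop(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion: returns an approximation of H @ grad."""
    q = grad.copy()
    history = []
    # First loop: newest curvature pair to oldest.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / y.dot(s)
        a = rho * s.dot(q)
        history.append((a, rho, s, y))
        q -= a * y
    # Scale by the usual initial Hessian estimate from the newest pair.
    if s_list:
        q *= s_list[-1].dot(y_list[-1]) / y_list[-1].dot(y_list[-1])
    # Second loop: oldest pair to newest.
    for a, rho, s, y in reversed(history):
        b = rho * y.dot(q)
        q += (a - b) * s
    return q

def progressive_lbfgs(X, y, batch0=16, growth=1.5, memory=5, iters=20, seed=0):
    """L-BFGS on least squares with a sample size that grows every iteration.

    Assumption: the sample grows geometrically. The paper decides when to
    grow the sample with its own test, which is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    s_list, y_list, sizes = [], [], []
    batch = float(batch0)
    for _ in range(iters):
        idx = rng.choice(n, size=min(int(batch), n), replace=False)
        Xb, yb = X[idx], y[idx]
        loss = lambda v: 0.5 * np.mean((Xb @ v - yb) ** 2)
        grad = lambda v: Xb.T @ (Xb @ v - yb) / len(idx)
        g = grad(w)
        p = -two_loop(g, s_list, y_list)
        # Armijo backtracking line search on the sampled loss.
        t, f0, slope = 1.0, loss(w), g.dot(p)
        while loss(w + t * p) > f0 + 1e-4 * t * slope and t > 1e-10:
            t *= 0.5
        w_new = w + t * p
        # Curvature pair from gradients on the same sample (an assumption).
        s, yv = w_new - w, grad(w_new) - g
        if s.dot(yv) > 1e-10 * s.dot(s):  # keep only pairs with positive curvature
            s_list.append(s)
            y_list.append(yv)
            if len(s_list) > memory:
                s_list.pop(0)
                y_list.pop(0)
        w = w_new
        sizes.append(len(idx))
        batch *= growth
    return w, sizes

# Hypothetical synthetic problem, used only to exercise the sketch.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = X @ np.arange(1.0, 6.0) + 0.1 * rng.normal(size=500)
w, sizes = progressive_lbfgs(X, y)
print(sizes)
```

Early iterations are cheap and noisy, while later iterations approach the full-batch regime in which the line search and the quasi-Newton model are more reliable, which is the trade-off the abstract describes.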
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
