A Linearly-Convergent Stochastic L-BFGS Algorithm
Philipp Moritz, Robert Nishihara, Michael I. Jordan

TL;DR
This paper introduces a stochastic L-BFGS algorithm with a proven linear convergence rate for strongly convex and smooth functions. It combines the stochastic L-BFGS variant of Byrd et al. (2014) with the variance reduction approach of Johnson and Zhang (2013), and it performs well empirically on large-scale problems.
Contribution
It presents a novel stochastic L-BFGS algorithm with a proven linear convergence rate, integrating variance reduction methods for improved optimization efficiency.
Findings
Converges linearly, with a proof for strongly convex and smooth functions.
Performs well on large-scale convex and non-convex problems.
Effective across a wide range of step sizes.
Abstract
We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision. Furthermore, we show that our algorithm performs well for a wide range of step sizes, often differing by several orders of magnitude.
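The combination the abstract describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a ridge-regularized least-squares objective, takes the last inner iterate as the next snapshot, and every function name, parameter name, and default value (`svrg_lbfgs`, `two_loop`, `eta`, `m`, `L`, `M`, batch sizes) is hypothetical. The gradient estimate is variance-reduced in the style of Johnson and Zhang (2013), and the curvature pairs come from subsampled Hessian-vector products on averaged iterates in the style of Byrd et al. (2014).

```python
from collections import deque

import numpy as np


def two_loop(v, pairs):
    """Apply the L-BFGS inverse-Hessian approximation to v (standard two-loop recursion)."""
    q = v.copy()
    alphas = []
    for s, y in reversed(pairs):              # newest pair first
        a = (s @ q) / (s @ y)
        q -= a * y
        alphas.append(a)
    if pairs:                                 # scale by gamma = s'y / y'y of the newest pair
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(pairs, reversed(alphas)):   # oldest pair first
        b = (y @ q) / (s @ y)
        q += (a - b) * s
    return q


def svrg_lbfgs(X, y, lam=1e-2, eta=0.05, epochs=30, m=100,
               batch=50, hess_batch=100, L=10, M=10, seed=0):
    """Sketch: variance-reduced gradients preconditioned by stochastic L-BFGS.

    Objective (an assumption made for this example):
        f(w) = 1/(2n) * ||X w - y||^2 + lam/2 * ||w||^2
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    def grad(w, idx):                         # minibatch gradient
        return X[idx].T @ (X[idx] @ w - y[idx]) / len(idx) + lam * w

    def hvp(v, idx):                          # minibatch Hessian-vector product
        return X[idx].T @ (X[idx] @ v) / len(idx) + lam * v

    w = np.zeros(d)
    pairs = deque(maxlen=M)                   # keep only the M most recent (s, y) pairs
    u_prev = None
    everything = np.arange(n)
    for _ in range(epochs):
        mu = grad(w, everything)              # full gradient at the snapshot w
        x = w.copy()
        x_sum = np.zeros(d)
        for t in range(1, m + 1):
            S = rng.choice(n, size=batch, replace=False)
            v = grad(x, S) - grad(w, S) + mu  # variance-reduced gradient estimate
            x_sum += x
            x = x - eta * two_loop(v, pairs)  # quasi-Newton step (plain step if no pairs yet)
            if t % L == 0:                    # refresh curvature every L steps
                u = x_sum / L                 # average of the last L iterates
                x_sum = np.zeros(d)
                if u_prev is not None:
                    s = u - u_prev
                    T = rng.choice(n, size=hess_batch, replace=False)
                    yv = hvp(s, T)
                    if s @ yv > 1e-10 * (s @ s):   # skip degenerate pairs
                        pairs.append((s, yv))
                u_prev = u
        w = x                                 # assumption: last iterate becomes the snapshot
    return w


# Small synthetic demo: compare against the closed-form ridge solution.
rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 10))
y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(1000)
w_star = np.linalg.solve(X.T @ X / 1000 + 1e-2 * np.eye(10), X.T @ y / 1000)
w_hat = svrg_lbfgs(X, y)
print("distance to closed-form solution:", np.linalg.norm(w_hat - w_star))
```

Because the objective here is quadratic, the Hessian-vector product does not depend on the point at which it is evaluated. For a general smooth loss, `hvp` would also need the averaged iterate `u` as an argument.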
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Optimization and Search Problems
