Asynchronous Parallel Stochastic Quasi-Newton Methods
Qianqian Tong, Guannan Liang, Xingyu Cai, Chunjiang Zhu, Jinbo Bi

TL;DR
This paper introduces an asynchronous parallel stochastic quasi-Newton method (AsySQN) that parallelizes L-BFGS with a convergence guarantee, achieving significant speedup and outperforming first-order methods on ill-conditioned problems.
Contribution
It presents the first truly parallelized L-BFGS algorithm with proven convergence and demonstrates its efficiency over existing stochastic methods.
Findings
Achieves a linear convergence rate with parallelization.
Demonstrates significant speedup in empirical tests.
Outperforms first-order methods on ill-conditioned problems.
Abstract
Although first-order stochastic algorithms, such as stochastic gradient descent, have been the main force in scaling up machine learning models such as deep neural nets, second-order quasi-Newton methods have started to draw attention due to their effectiveness in dealing with ill-conditioned optimization problems. The L-BFGS method is one of the most widely used quasi-Newton methods. We propose an asynchronous parallel algorithm for the stochastic quasi-Newton method (AsySQN). Unlike prior attempts, which parallelize only the gradient calculation or the two-loop recursion of L-BFGS, our algorithm is the first that truly parallelizes L-BFGS with a convergence guarantee. By adopting the variance reduction technique, a prior stochastic L-BFGS method, which was not designed for parallel computing, reaches a linear convergence rate. We prove that our asynchronous parallel scheme maintains the…
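The two-loop recursion mentioned in the abstract is the standard way L-BFGS turns a gradient into a search direction without ever forming the inverse Hessian. The following is a minimal serial sketch of that textbook recursion, not the AsySQN algorithm or its asynchronous scheme; the names `dot` and `two_loop_direction` are hypothetical, and the initial scaling `gamma` is a common choice assumed here rather than one taken from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def two_loop_direction(grad, s_list, y_list):
    """Approximate H @ grad, where H is the L-BFGS inverse-Hessian estimate.

    s_list[i] is a parameter difference and y_list[i] the matching gradient
    difference (curvature pair), ordered oldest first. Each pair is assumed
    to satisfy dot(s, y) > 0.
    """
    q = list(grad)
    history = []  # (rho, alpha) per pair, stored newest first

    # First loop: walk the curvature pairs from newest to oldest.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        history.append((rho, alpha))

    # Initial inverse-Hessian guess H0 = gamma * I (assumed scaling).
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]

    # Second loop: walk the pairs from oldest to newest.
    pairs = zip(s_list, y_list)
    for (s, y), (rho, alpha) in zip(pairs, reversed(history)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]

    return r  # the descent step is then x_new = x - step_size * r


# Ill-conditioned quadratic f(x) = 0.5 * (x0**2 + 100 * x1**2) at x = (1, 1):
# the gradient is (1, 100), and the curvature pairs below are exact.
direction = two_loop_direction(
    [1.0, 100.0],
    s_list=[[1.0, 0.0], [0.0, 1.0]],
    y_list=[[1.0, 0.0], [0.0, 100.0]],
)
```

On this toy quadratic the two curvature pairs capture the Hessian exactly, so the returned direction matches the Newton direction, which illustrates why quasi-Newton steps help when the condition number is large. With an empty memory the function falls back to the plain gradient.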
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
