Asynchronous Parallel Stochastic Quasi-Newton Methods

Qianqian Tong; Guannan Liang; Xingyu Cai; Chunjiang Zhu; Jinbo Bi

arXiv:2011.00667·math.OC·November 3, 2020

TL;DR

This paper introduces AsySQN, an asynchronous parallel stochastic quasi-Newton method that parallelizes L-BFGS with a convergence guarantee, reports significant speedup from parallelization, and outperforms first-order methods on ill-conditioned problems.

Contribution

It presents the first truly parallelized L-BFGS algorithm with proven convergence and demonstrates its efficiency over existing stochastic methods.

Findings

01. Achieves a linear convergence rate with parallelization.

02. Demonstrates significant speedup in empirical tests.

03. Outperforms first-order methods on ill-conditioned problems.

Abstract

Although first-order stochastic algorithms, such as stochastic gradient descent, have been the main force for scaling up machine learning models such as deep neural nets, second-order quasi-Newton methods have started to draw attention because they are effective on ill-conditioned optimization problems. The L-BFGS method is one of the most widely used quasi-Newton methods. We propose an asynchronous parallel algorithm for the stochastic quasi-Newton method (AsySQN). Unlike prior attempts, which parallelize only the gradient calculation or the two-loop recursion of L-BFGS, our algorithm is the first that truly parallelizes L-BFGS with a convergence guarantee. By adopting the variance reduction technique, a prior stochastic L-BFGS method, which was not designed for parallel computing, reaches a linear convergence rate. We prove that our asynchronous parallel scheme maintains the…
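For readers unfamiliar with the two-loop recursion the abstract refers to, the following is a minimal serial sketch of the textbook L-BFGS two-loop recursion in plain Python. It is not the AsySQN algorithm and contains no asynchrony or variance reduction. The function name `two_loop_direction`, the helper `dot`, and the list-based history layout (oldest pair first) are assumptions made for illustration only.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def two_loop_direction(grad: Vector,
                       s_hist: List[Vector],
                       y_hist: List[Vector]) -> Vector:
    """Approximate H * grad, where H is the L-BFGS inverse-Hessian estimate.

    s_hist[i] = x_{i+1} - x_i and y_hist[i] = g_{i+1} - g_i are the stored
    curvature pairs, oldest first (hypothetical layout for this sketch).
    Each pair is assumed to satisfy dot(y, s) > 0.
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_hist, y_hist)]

    # First loop: walk the history from newest to oldest.
    alphas = []
    for s, y, rho in zip(reversed(s_hist), reversed(y_hist), reversed(rhos)):
        alpha = rho * dot(s, q)
        alphas.append(alpha)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]

    # Initial inverse-Hessian guess: a scaled identity from the newest pair.
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]

    # Second loop: walk the history from oldest to newest.
    for s, y, rho, alpha in zip(s_hist, y_hist, rhos, reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]

    return r
```

A descent step would then move along the negative of the returned vector. With an empty history the function returns the gradient unchanged, so the step reduces to plain gradient descent.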

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data