An Analysis of Asynchronous Stochastic Accelerated Coordinate Descent
Richard Cole, Yixin Tao

TL;DR
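This paper shows that acceleration and asynchronous parallelism can be combined in coordinate descent. Under a bounded asynchrony assumption, the asynchronous accelerated algorithm achieves a linear or sublinear speedup for strongly convex functions, depending on the amount of asynchrony, and a sublinear speedup for general convex functions.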
Contribution
It provides the first analysis of asynchronous parallel accelerated coordinate descent, showing how acceleration and parallelism can be combined effectively.
Findings
Linear speedup for strongly convex functions with limited asynchrony.
Sublinear speedup for strongly convex functions with larger asynchrony.
Sublinear speedup for general convex functions.
Abstract
Gradient descent, and coordinate descent in particular, are core tools in machine learning and elsewhere. Large problem instances are common. To help solve them, two orthogonal approaches are known: acceleration and parallelism. In this work, we ask whether they can be used simultaneously. The answer is "yes". More specifically, we consider an asynchronous parallel version of the accelerated coordinate descent algorithm proposed and analyzed by Lin, Liu and Xiao (SIOPT'15). We give an analysis based on the efficient implementation of this algorithm. The only constraint is a standard bounded asynchrony assumption, namely that each update can overlap with at most q others. (q is at most the number of processors times the ratio in the lengths of the longest and shortest updates.) We obtain the following three results: 1. A linear speedup for strongly convex functions so long as q is…
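To make the terms in the abstract concrete, here is a minimal sketch of plain stochastic coordinate descent on a small strongly convex quadratic, together with the upper bound on q that the abstract states. It is sequential and non-accelerated, so it is neither the Lin, Liu and Xiao algorithm nor the asynchronous version analyzed in the paper. The names coordinate_descent and overlap_bound, the test matrix, and the timing values are assumptions chosen for illustration.

```python
import random


def coordinate_descent(A, b, steps, seed=0):
    """Minimize f(x) = 0.5 * x^T A x - b^T x by stochastic coordinate descent.

    A is assumed symmetric positive definite (a list of lists), so f is
    strongly convex. Each step picks one coordinate uniformly at random
    and minimizes f exactly along that coordinate.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(steps):
        j = rng.randrange(n)
        # Partial derivative of f with respect to x[j].
        grad_j = sum(A[j][k] * x[k] for k in range(n)) - b[j]
        # A[j][j] is the Lipschitz constant of the gradient along coordinate j.
        x[j] -= grad_j / A[j][j]
    return x


def overlap_bound(num_processors, longest_update, shortest_update):
    """Upper bound on q from the abstract: processors times the duration ratio."""
    return num_processors * (longest_update / shortest_update)


# Hypothetical 2x2 problem; the exact minimizer solves A x = b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = coordinate_descent(A, b, steps=200)

# Hypothetical setting: 8 processors, updates taking between 1.5 and 3.0 time units.
q_max = overlap_bound(8, 3.0, 1.5)
```

In the asynchronous setting described in the abstract, several processors would run the loop body concurrently on shared memory, so a coordinate update may be computed from a stale x; q limits how many other updates can overlap with any single one.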
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Complexity and Algorithms in Graphs · Optimization and Search Problems
