Communication-Efficient Distributed Dual Coordinate Ascent

Martin Jaggi; Virginia Smith; Martin Takáč; Jonathan Terhorst; Sanjay Krishnan; Thomas Hofmann; Michael I. Jordan

arXiv:1409.1458 · cs.LG · September 30, 2014 · 114 cites

Open Access

TL;DR

This paper introduces CoCoA, a communication-efficient distributed optimization framework that leverages local computation to significantly reduce communication costs, achieving faster convergence in large-scale machine learning tasks.

Contribution

The paper presents CoCoA, a primal-dual framework with a convergence rate analysis and a practical Spark implementation; in the reported experiments it converges faster than mini-batch versions of SGD and SDCA.

Findings

01

CoCoA reaches the same .001-accurate solution quality on average 25x as quickly as state-of-the-art mini-batch versions of SGD and SDCA.

02

Local computation in a primal-dual setting reduces the communication needed to reach the same solution quality.

03

Experiments on real-world distributed datasets, with implementations in Spark, demonstrate its effectiveness.

Abstract

Communication remains the most significant bottleneck in the performance of distributed optimization algorithms for large-scale machine learning. In this paper, we propose a communication-efficient framework, CoCoA, that uses local computation in a primal-dual setting to dramatically reduce the amount of necessary communication. We provide a strong convergence rate analysis for this class of algorithms, as well as experiments on real-world distributed datasets with implementations in Spark. In our experiments, we find that as compared to state-of-the-art mini-batch versions of SGD and SDCA algorithms, CoCoA converges to the same .001-accurate solution quality on average 25x as quickly.
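The idea of trading communication for local computation in a primal-dual setting can be illustrated with a small single-process sketch. This is not the authors' Spark implementation, and the details below are assumptions not stated on this page: a ridge-regression objective (squared loss), a closed-form SDCA coordinate step as the local solver, `H` local steps per worker per round, and aggregation by averaging the workers' updates with weight `1/K`. The names `local_sdca`, `cocoa_sketch`, and `duality_gap` are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def local_sdca(X, y, alpha, w, idx, lam, n, H, rng):
    """Run H dual coordinate steps on one worker's partition (assumed local solver).

    Returns the worker's proposed change to its dual variables and to w.
    """
    w_loc = list(w)                       # local copy of the shared primal vector
    d_alpha = {i: 0.0 for i in idx}       # accumulated dual changes on this partition
    for _ in range(H):
        i = rng.choice(idx)
        a_i = alpha[i] + d_alpha[i]
        # Closed-form coordinate maximizer for the loss 0.5 * (x.w - y)^2
        step = (y[i] - dot(X[i], w_loc) - a_i) / (1.0 + dot(X[i], X[i]) / (lam * n))
        d_alpha[i] += step
        for j, x_ij in enumerate(X[i]):
            w_loc[j] += step * x_ij / (lam * n)
    d_w = [wl - wj for wl, wj in zip(w_loc, w)]
    return d_alpha, d_w

def cocoa_sketch(X, y, parts, lam, rounds, H, seed=0):
    """Simulate K workers in one process; one aggregation per outer round."""
    n, d, K = len(X), len(X[0]), len(parts)
    rng = random.Random(seed)
    alpha, w = [0.0] * n, [0.0] * d
    for _ in range(rounds):
        # Every worker starts from the same (alpha, w); no communication here.
        updates = [local_sdca(X, y, alpha, w, idx, lam, n, H, rng) for idx in parts]
        # Single communication step: average the K proposed updates.
        for d_alpha, d_w in updates:
            for i, da in d_alpha.items():
                alpha[i] += da / K
            for j in range(d):
                w[j] += d_w[j] / K
    return alpha, w

def duality_gap(X, y, alpha, w, lam):
    """Primal minus dual objective, assuming w = (1 / (lam * n)) * sum_i alpha_i * x_i."""
    n = len(X)
    primal = 0.5 * lam * dot(w, w) + sum(
        0.5 * (dot(x, w) - yi) ** 2 for x, yi in zip(X, y)) / n
    dual = sum(a * yi - 0.5 * a * a for a, yi in zip(alpha, y)) / n - 0.5 * lam * dot(w, w)
    return primal - dual
```

In this sketch only the `d`-dimensional change `d_w` would need to cross the network once per round, while each worker performs `H` coordinate updates locally; raising `H` shifts work from communication to local computation. The duality gap gives a stopping criterion, so a run could stop once `duality_gap(...)` falls below a target such as 0.001.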

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Sparse and Compressive Sensing Techniques

Methods: Stochastic Gradient Descent