Compressed Coded Distributed Computing
Songze Li, Mohammad Ali Maddah-Ali, A. Salman Avestimehr

TL;DR
This paper introduces compressed coded distributed computing (compressed CDC), a scheme that combines compression with coding across computations to reduce the communication load of distributed machine learning tasks.
Contribution
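The paper proposes a scheme that jointly exploits two techniques: compression, which combines intermediate results of the same computation, and coding, which operates across intermediate results of different computations. Used together, they reduce the communication load below that of existing methods.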
Findings
Compressed CDC reduces the communication load more than traditional methods.
The scheme targets tasks whose final stage is a linear aggregation of intermediate results, which is common in machine learning.
The paper gives a theoretical characterization of the communication load improvement.
Abstract
Communication overhead is one of the major performance bottlenecks in large-scale distributed computing systems, in particular for machine learning applications. Conventionally, compression techniques are used to reduce the load of communication by combining intermediate results of the same computation task as much as possible. Recently, via the development of coded distributed computing (CDC), it has been shown that it is possible to enable coding opportunities across intermediate results of different computation tasks to further reduce the communication load. We propose a new scheme, named compressed coded distributed computing (in short, compressed CDC), which jointly exploits the above two techniques (i.e., combining the intermediate results of the same computation and coding across the intermediate results of different computations) to significantly reduce the communication load…
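To make the two ideas in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's construction: the three-node setup, the batch placement (`GROUPS`, `STORES`, `MISSING`), the shuffle plan (`PLAN`), and the use of integer addition as the coding operation are all assumptions chosen for illustration. Each node first sums the intermediate vectors of one function over a batch group (compression), then sends the sum of two half-vectors destined for two different nodes (coding); each receiver subtracts the half it can recompute locally.

```python
import random

K, DIM, HALF = 3, 4, 2  # 3 nodes; intermediate vectors of length 4, split into halves of 2

# Hypothetical placement: 6 batches in 3 groups, each group stored on 2 of the 3 nodes.
GROUPS = {"A": [0, 1], "B": [2, 3], "C": [4, 5]}        # batch ids per group
STORES = {0: ("A", "B"), 1: ("B", "C"), 2: ("A", "C")}  # groups held by each node
MISSING = {0: "C", 1: "A", 2: "B"}                      # the one group each node lacks

random.seed(0)
# v[q][n]: intermediate vector of function q on batch n (stand-in for a partial gradient).
v = [[[random.randint(-9, 9) for _ in range(DIM)] for _ in range(6)] for _ in range(K)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def half(vec, i):
    return vec[i * HALF:(i + 1) * HALF]

def compress(q, group):
    """Compression: pre-sum function q's intermediate vectors over one batch group."""
    total = [0] * DIM
    for n in GROUPS[group]:
        total = add(total, v[q][n])
    return total

# c[q]: the compressed vector node q (which reduces function q) is missing.
c = {q: compress(q, MISSING[q]) for q in range(K)}

# Coding: each sender adds two half-vectors meant for two different receivers.
# PLAN[sender] = ((receiver, half index), (receiver, half index)).
PLAN = {0: ((1, 0), (2, 0)), 1: ((0, 0), (2, 1)), 2: ((0, 1), (1, 1))}
messages = {s: add(half(c[a], i), half(c[b], j)) for s, ((a, i), (b, j)) in PLAN.items()}

def decode(q):
    """Recover c[q] at node q by subtracting the half it can recompute from its own batches."""
    got = [None, None]
    for s, parts in PLAN.items():
        if s == q:
            continue
        mine, other = parts if parts[0][0] == q else parts[::-1]
        _, i = mine
        o, j = other
        got[i] = sub(messages[s], half(c[o], j))  # c[o] uses a group that node q stores
    return got[0] + got[1]                        # concatenate the two halves

def reduce_at(q):
    """Final linear aggregation at node q: local compressed sums plus the decoded one."""
    total = decode(q)
    for g in STORES[q]:
        total = add(total, compress(q, g))
    return total

def reference(q):
    """Centralized sum over all batches, used only to check correctness."""
    total = [0] * DIM
    for n in range(6):
        total = add(total, v[q][n])
    return total

assert all(reduce_at(q) == reference(q) for q in range(K))

# Vector entries sent in total under each strategy for this toy placement.
load = {
    "uncoded": K * 2 * DIM,           # every node fetches 2 missing batch vectors
    "compression only": K * DIM,      # every node fetches 1 pre-summed vector
    "coding only": 2 * K * HALF,      # the coded exchange repeated once per batch in a group
    "compressed + coded": K * HALF,   # one coded half-vector per sender
}
print(load)
```

In this toy setting the combined strategy sends 6 entries, against 12 for either technique alone and 24 with neither. The numbers depend entirely on the assumed placement and are not results from the paper.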
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
