On the Worst-case Communication Overhead for Distributed Data Shuffling
Mohamed Attia, Ravi Tandon

TL;DR
This paper characterizes the minimum communication overhead in distributed data shuffling, proposing an optimal coding scheme that matches the theoretical lower bound in worst-case scenarios.
Contribution
It introduces a novel coded data delivery scheme for distributed data shuffling with no excess storage, achieving optimal worst-case communication overhead.
Findings
Proposed scheme matches the information-theoretic lower bound.
Applicable to any shuffle and number of workers.
Provides a fundamental limit on communication overhead.
Abstract
Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing across distributed workers to achieve speed-up and efficiency. Several computational tasks are of sequential nature, and involve multiple passes over the data. At each iteration over the data, it is common practice to randomly re-shuffle the data at the master node, assigning different batches for each worker to process. This random re-shuffling operation comes at the cost of extra communication overhead, since at each shuffle, new data points need to be delivered to the distributed workers. In this paper, we focus on characterizing the information theoretically optimal communication overhead for the distributed data shuffling problem. We propose a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
