Fundamental Limits of Decentralized Data Shuffling

Kai Wan; Daniela Tuninetti; Mingyue Ji; Giuseppe Caire; Pablo Piantanida

arXiv:1807.00056·cs.IT·January 15, 2020

TL;DR

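This paper studies the fundamental limits of decentralized data shuffling in distributed machine learning, where workers exchange data with one another over a shared link instead of receiving it from a master. It proposes converse bounds and achievable schemes that characterize the communication load attainable under worker storage constraints.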

Contribution

It introduces converse bounds and achievable schemes for decentralized data shuffling with uncoded storage. The schemes are near-optimal in general and exactly optimal under specific conditions (large storage or a small number of workers).

Findings

01 The converse and achievable bounds are within a factor of 3/2 of each other under uncoded storage.

02 The proposed schemes are optimal for large storage sizes or for systems with at most four workers.

03 Decentralized approach reduces communication load compared to master-worker models.

Abstract

Data shuffling of training data among different computing nodes (workers) has been identified as a core element to improve the statistical performance of modern large-scale machine learning algorithms. Data shuffling is often considered as one of the most significant bottlenecks in such systems due to the heavy communication load. Under a master-worker architecture (where a master has access to the entire dataset and only communication between the master and the workers is allowed) coding has been recently proved to considerably reduce the communication load. This work considers a different communication paradigm referred to as decentralized data shuffling, where workers are allowed to communicate with one another via a shared link. The decentralized data shuffling problem has two phases: workers communicate with each other during the data shuffling phase, and then workers update their…
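To make the setting concrete, the following is a minimal sketch of one shuffle epoch over a shared link. It is not the paper's scheme: it models only a naive uncoded baseline in which every data unit that changes workers is broadcast once by its current holder. The function name `naive_decentralized_shuffle`, its parameters, and the even random reassignment are all assumptions made for illustration.

```python
import random


def naive_decentralized_shuffle(num_workers, units_per_worker, seed=0):
    """Simulate one uncoded shuffle epoch (illustrative baseline only).

    Returns (new_storage, broadcasts), where broadcasts counts the data
    units sent over the shared link.
    """
    rng = random.Random(seed)
    q = units_per_worker
    units = list(range(num_workers * q))

    # Assumed current placement: worker k holds a contiguous block of units.
    old = [set(units[k * q:(k + 1) * q]) for k in range(num_workers)]

    # Assumed new assignment: a random permutation split evenly among workers.
    perm = units[:]
    rng.shuffle(perm)
    new = [set(perm[k * q:(k + 1) * q]) for k in range(num_workers)]

    # Shuffling phase: each unit a worker needs but does not hold is
    # broadcast, uncoded, by the worker that currently stores it.
    broadcasts = 0
    for k in range(num_workers):
        for u in new[k] - old[k]:
            holder = next(j for j in range(num_workers) if u in old[j])
            assert holder != k  # a missing unit must live on another worker
            broadcasts += 1

    # After the exchange, each worker keeps exactly its new assignment.
    return new, broadcasts


if __name__ == "__main__":
    storage, load = naive_decentralized_shuffle(num_workers=4, units_per_worker=3)
    print("units broadcast on the shared link:", load)
```

In this baseline the load equals the number of units that move between workers. Coded schemes, as discussed in the abstract, aim to lower that load by letting a single transmission serve several workers at once, which this sketch does not attempt.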
