Distributed Linearly Separable Computation with Arbitrary Heterogeneous Data Assignment
Ziting Zhang, Kai Wan, Minquan Cheng, Shuo Shao, Giuseppe Caire

TL;DR
This paper investigates the fundamental tradeoff between computation and communication in distributed systems performing linearly separable tasks with arbitrary, heterogeneous data assignments across workers, proposing universal schemes and bounds.
Contribution
It introduces a universal computing scheme and converse bounds for heterogeneous data assignments, extending to fractional communication costs, in distributed linearly separable computation.
Findings
The universal computing scheme matches the converse bounds in some regimes.
Characterizing the structure of the data assignment is key to the tradeoff analysis.
The schemes extend to fractional communication costs.
Abstract
Distributed linearly separable computation is a fundamental problem in large-scale distributed systems, requiring the computation of linearly separable functions over different datasets across distributed workers. This paper studies a heterogeneous distributed linearly separable computation problem, including one master and N distributed workers. The linearly separable task function involves Kc linear combinations of K messages, where each message is a function of one dataset. Distinguished from the existing homogeneous settings that assume each worker holds the same number of datasets, where the data assignment is carefully designed and controlled by the data center (e.g., the cyclic assignment), we consider a more general setting with arbitrary heterogeneous data assignment across workers, where 'arbitrary' means that the data assignment is given in advance and 'heterogeneous' means…
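To make the setup in the abstract concrete, the following is a minimal sketch of a linearly separable task under a fixed, heterogeneous data assignment. It is a naive baseline written for illustration, not the universal scheme proposed in the paper. All names and values (`messages`, `F`, `assignment`, `pick_owner`, the symbol length `L`) are assumptions introduced here. Each dataset is handled by one worker that stores it, each worker sends Kc partial linear combinations, and the master adds them up.

```python
# Illustrative sketch only: a naive baseline, NOT the scheme from the paper.
# Every name and value below is hypothetical.

K = 4    # number of datasets / messages
N = 3    # number of workers
Kc = 2   # number of requested linear combinations
L = 3    # symbols per message (assumed for the example)

# Message W_k depends on dataset k only; here plain integer vectors.
messages = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [1, 0, 1],
]

# Demand matrix F (Kc x K): the master wants F applied to [W_1; ...; W_K].
F = [
    [1, 1, 1, 1],
    [1, 2, 3, 4],
]

# Data assignment given in advance; workers hold different numbers of datasets.
assignment = {
    0: {0, 1, 2},
    1: {2, 3},
    2: {3},
}


def pick_owner(assignment, num_datasets):
    """Map each dataset to the lowest-indexed worker that stores it."""
    owner = {}
    for k in range(num_datasets):
        holders = sorted(n for n, held in assignment.items() if k in held)
        if not holders:
            raise ValueError(f"dataset {k} is stored by no worker")
        owner[k] = holders[0]
    return owner


def worker_partial(n, owner):
    """Worker n computes Kc partial combinations over the datasets it owns."""
    owned = [k for k, w in owner.items() if w == n]
    return [
        [sum(F[c][k] * messages[k][i] for k in owned) for i in range(L)]
        for c in range(Kc)
    ]


def master_recover(partials):
    """The master sums the workers' partial combinations entrywise."""
    return [
        [sum(p[c][i] for p in partials) for i in range(L)]
        for c in range(Kc)
    ]


owner = pick_owner(assignment, K)
partials = [worker_partial(n, owner) for n in range(N)]
result = master_recover(partials)

# Direct computation of the Kc combinations, for comparison.
expected = [
    [sum(F[c][k] * messages[k][i] for k in range(K)) for i in range(L)]
    for c in range(Kc)
]
print(result == expected)
```

In this baseline every worker transmits Kc coded messages no matter how many datasets it stores. The computation and communication tradeoff studied in the paper concerns how much better one can do for a given assignment.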
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed systems and fault tolerance · Distributed and Parallel Computing Systems
