Scheduling Distributed Clusters of Parallel Machines: Primal-Dual and LP-based Approximation Algorithms [Full Version]
Riley Murray, Samir Khuller, Megan Chao

TL;DR
This paper introduces two approximation algorithms for scheduling jobs across multiple distributed clusters of parallel machines, with the objective of minimizing weighted average completion time in large-scale data processing settings.
Contribution
The paper presents the first constant factor approximation algorithms for scheduling on distributed clusters of parallel machines, one LP-based and one mapping-based.
Findings
LP-based algorithm offers strong performance guarantees.
Mapping-based algorithm is simple and computationally fast.
First constant factor approximations for this problem.
Abstract
The Map-Reduce computing framework rose to prominence with datasets of such size that dozens of machines on a single cluster were needed for individual jobs. As datasets approach the exabyte scale, a single job may need distributed processing not only on multiple machines, but on multiple clusters. We consider a scheduling problem to minimize weighted average completion time of N jobs on M distributed clusters of parallel machines. In keeping with the scale of the problems motivating this work, we assume that (1) each job is divided into M "subjobs" and (2) distinct subjobs of a given job may be processed concurrently. When each cluster is a single machine, this is the NP-Hard concurrent open shop problem. A clear limitation of such a model is that a serial processing assumption sidesteps the issue of how different tasks of a given subjob might be processed in parallel. Our algorithms…
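To make the objective concrete, here is a minimal sketch for the special case the abstract mentions, where each cluster is a single machine (concurrent open shop). It is an illustration only, not the paper's algorithm. It assumes a permutation schedule, in which every machine processes jobs in the same order, and it assumes a job finishes when its last subjob finishes. The function name `weighted_completion_time` and its parameters are hypothetical. It returns the weighted sum of completion times; whether "weighted average" involves a further normalization is not stated in the abstract.

```python
from typing import Sequence


def weighted_completion_time(
    order: Sequence[int],
    processing: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> float:
    """Weighted sum of completion times for a permutation schedule.

    Assumptions (illustrative, not taken from the paper):
      * processing[j][i] is the time job j's subjob needs on machine i;
      * every machine processes jobs in the same order, given by `order`;
      * a job completes when the last of its nonempty subjobs completes.
    """
    num_machines = len(processing[0])
    machine_clock = [0.0] * num_machines  # current finish time on each machine
    total = 0.0
    for j in order:
        finish_times = []
        for i in range(num_machines):
            machine_clock[i] += processing[j][i]
            # Only machines where job j actually has work delay its completion.
            if processing[j][i] > 0:
                finish_times.append(machine_clock[i])
        completion = max(finish_times, default=0.0)
        total += weights[j] * completion
    return total


# Two jobs, two single-machine clusters.
processing = [[2, 1], [1, 3]]
weights = [3, 1]
print(weighted_completion_time([0, 1], processing, weights))  # 10.0
print(weighted_completion_time([1, 0], processing, weights))  # 15.0
```

In this toy instance, scheduling the heavier-weighted job first gives the lower objective value, which is the kind of ordering decision the scheduling problem asks for.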
