Locality-Aware Hybrid Coded MapReduce for Server-Rack Architecture
Sneh Gupta, V. Lalitha

TL;DR
This paper introduces a hybrid coded MapReduce scheme tailored for server-rack architectures, balancing intra- and cross-rack communication costs, and optimizes task assignment to improve data locality and reduce overall communication overhead.
Contribution
It proposes a novel hybrid coded MapReduce scheme that reduces cross-rack communication and formulates an optimization problem for task assignment to enhance data locality.
Findings
Hybrid scheme reduces cross-rack communication significantly.
Optimization improves data locality in task assignment.
Simulation results confirm efficiency gains.
Abstract
MapReduce is a widely used framework for distributed computing. Data shuffling between the Map phase and the Reduce phase of a job involves a large amount of data transfer across servers, which in turn increases job completion time. Recently, Coded MapReduce has been proposed to reduce the communication cost incurred in data shuffling. It achieves this by repeating Map tasks at multiple servers, which creates coded multicast opportunities for shuffling. We consider a server-rack architecture for MapReduce and, in this architecture, propose to divide the total communication cost into two components: intra-rack communication cost and cross-rack communication cost. Having noted that cross-rack data transfer operates at a lower speed than intra-rack data transfer, we present a scheme termed Hybrid Coded MapReduce which results in lower cross-rack…
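The coded multicast idea mentioned in the abstract can be illustrated with a small sketch. This is not the paper's Hybrid Coded MapReduce scheme; it is a generic toy instance of coded shuffling under assumed parameters: three servers, three files, each file mapped at two servers, and one Reduce function per server. The names `placement`, `missing`, `intermediate`, and `coded_shuffle` are hypothetical and chosen only for illustration.

```python
# Toy coded shuffle: 3 servers, 3 files, each file mapped at 2 servers.
# Illustrative assumptions only; not the scheme proposed in the paper.

SERVERS = (1, 2, 3)
# File "Fij" is mapped at servers i and j (Map tasks repeated twice).
placement = {1: {"F12", "F13"}, 2: {"F12", "F23"}, 3: {"F13", "F23"}}
# The one file each server did not map, so it lacks that Map output.
missing = {1: "F23", 2: "F13", 3: "F12"}


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def intermediate(reducer: int, file_id: str) -> bytes:
    """Stand-in for the 8-byte Map output of file_id destined for reducer."""
    return f"v{reducer}:{file_id}".encode().ljust(8, b".")


def half(value: bytes, part: int) -> bytes:
    """Return the first (part 0) or second (part 1) half of value."""
    mid = len(value) // 2
    return value[:mid] if part == 0 else value[mid:]


def part_index(sender: int, receiver: int) -> int:
    """Which half of the receiver's missing value this sender covers."""
    helpers = [k for k in SERVERS if k != receiver]
    return helpers.index(sender)


def coded_shuffle():
    """Run the coded shuffle; return (recovered values, bytes sent)."""
    halves = {k: [b"", b""] for k in SERVERS}
    sent = 0
    for sender in SERVERS:
        a, b = [k for k in SERVERS if k != sender]
        # The sender mapped both files that a and b are missing.
        assert missing[a] in placement[sender]
        assert missing[b] in placement[sender]
        pa = half(intermediate(a, missing[a]), part_index(sender, a))
        pb = half(intermediate(b, missing[b]), part_index(sender, b))
        packet = xor_bytes(pa, pb)  # one multicast serves both receivers
        sent += len(packet)
        # Each receiver mapped the file the other one is missing, so it
        # can recompute the other half locally and cancel it out.
        assert missing[b] in placement[a] and missing[a] in placement[b]
        halves[a][part_index(sender, a)] = xor_bytes(packet, pb)
        halves[b][part_index(sender, b)] = xor_bytes(packet, pa)
    recovered = {k: halves[k][0] + halves[k][1] for k in SERVERS}
    return recovered, sent


recovered, coded_bytes = coded_shuffle()
uncoded_bytes = sum(len(intermediate(k, missing[k])) for k in SERVERS)
print(coded_bytes, uncoded_bytes)
```

In this toy setup each server multicasts one half-size XOR packet, so the shuffle moves half as many bytes as sending the three missing values uncoded. The sketch treats all links as equal; it does not model the intra-rack versus cross-rack distinction that the paper's scheme targets.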
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Cryptography and Data Security
