Complexity Measures for Map-Reduce, and Comparison to Parallel Computing

Ashish Goel; Kamesh Munagala

arXiv:1211.6526·cs.DC·November 29, 2012·30 cites

Open Access

TL;DR

This paper introduces formal complexity measures for Map-Reduce algorithms and explains how the Map-Reduce model differs from traditional parallel models such as PRAM, so that large-scale data processing methods can be analyzed and compared more precisely.

Contribution

It proposes specific complexity measures for Map-Reduce and clarifies how this model fundamentally differs from other parallel computing frameworks.

Findings

01. Complexity measures enable fine-grained algorithm analysis

02. Map-Reduce differs significantly from PRAM in computational modeling

03. The proposed measures balance detail and abstraction in analysis

Abstract

The programming paradigm Map-Reduce and its main open-source implementation, Hadoop, have had an enormous impact on large-scale data processing. Our goal in this expository writeup is twofold: first, we want to present some complexity measures that allow us to talk about Map-Reduce algorithms formally, and second, we want to point out why this model is actually different from other models of parallel programming, most notably the PRAM (Parallel Random Access Memory) model. We are looking for complexity measures that are detailed enough to make fine-grained distinctions between different algorithms, but which also abstract away many of the implementation details.
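
To make the idea of measuring a Map-Reduce algorithm concrete, the following is a minimal in-memory sketch of a single map, shuffle, and reduce round for word counting. The two tallied quantities (`shuffled_pairs`, the total number of key-value pairs emitted by the mappers, and `max_reducer_input`, the largest number of values any single key receives) are illustrative assumptions chosen for this sketch; they are not the paper's definitions, and the helper names `map_reduce`, `word_mapper`, and `count_reducer` are hypothetical.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one in-memory map-shuffle-reduce round and tally two illustrative costs."""
    groups = defaultdict(list)   # shuffle: values grouped by key
    shuffled_pairs = 0           # total key-value pairs emitted by all mappers
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
            shuffled_pairs += 1

    # Reduce: each key is processed independently of every other key.
    output = {key: reducer(key, values) for key, values in groups.items()}

    # Largest input handed to any single reducer invocation (assumed cost proxy).
    max_reducer_input = max((len(v) for v in groups.values()), default=0)
    costs = {"shuffled_pairs": shuffled_pairs, "max_reducer_input": max_reducer_input}
    return output, costs


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in the line."""
    for word in line.split():
        yield word, 1


def count_reducer(key, values):
    """Sum the counts collected for one word."""
    return sum(values)


lines = ["a b a", "b a c"]
counts, costs = map_reduce(lines, word_mapper, count_reducer)
```

Separating a total-work quantity from a worst-single-key quantity is one plausible way to distinguish algorithms that move the same amount of data overall but concentrate it differently across keys.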

Taxonomy

Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies