Complexity Measures for Map-Reduce, and Comparison to Parallel Computing
Ashish Goel, Kamesh Munagala

TL;DR
This paper introduces formal complexity measures for Map-Reduce algorithms, highlighting their differences from traditional parallel models like PRAM, to better understand and compare large-scale data processing methods.
Contribution
It proposes specific complexity measures for Map-Reduce and clarifies how this model fundamentally differs from other parallel computing frameworks.
Findings
Complexity measures enable fine-grained algorithm analysis
Map-Reduce differs significantly from PRAM in computational modeling
The proposed measures balance detail and abstraction in analysis
Abstract
The programming paradigm Map-Reduce and its main open-source implementation, Hadoop, have had an enormous impact on large-scale data processing. Our goal in this expository writeup is two-fold: first, we want to present some complexity measures that allow us to talk about Map-Reduce algorithms formally, and second, we want to point out why this model is actually different from other models of parallel programming, most notably the PRAM (Parallel Random Access Memory) model. We are looking for complexity measures that are detailed enough to make fine-grained distinctions between different algorithms, but which also abstract away many of the implementation details.
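For readers unfamiliar with the paradigm, here is a minimal in-memory sketch of one Map-Reduce round (map, group by key, reduce) in Python. The abstract does not spell out the paper's measures, so the two statistics collected here (`pairs_emitted` and `max_values_per_key`) are illustrative assumptions about the kind of per-key quantity one might track. They are not the authors' definitions. The function names and the word-count example are likewise hypothetical.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Run one in-memory Map-Reduce round and collect simple size statistics."""
    groups = defaultdict(list)
    pairs_emitted = 0

    # Map phase: each record independently emits (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle step: group values by key
            pairs_emitted += 1

    # Reduce phase: each key is processed independently of all other keys.
    output = {key: reducer(key, values) for key, values in groups.items()}

    # Illustrative statistics only (an assumption, not the paper's measures).
    stats = {
        "pairs_emitted": pairs_emitted,
        "max_values_per_key": max((len(v) for v in groups.values()), default=0),
    }
    return output, stats


def word_count_mapper(line):
    # Emit (word, 1) for every whitespace-separated word in the line.
    return [(word, 1) for word in line.split()]


def word_count_reducer(key, values):
    # Sum the counts collected for a single word.
    return sum(values)


counts, stats = run_map_reduce(
    ["a b a", "b a c"], word_count_mapper, word_count_reducer
)
```

Unlike a shared-memory model such as PRAM, the only communication in this sketch happens through the grouping of values by key between the two phases, which is why per-key quantities are a natural thing to measure.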
Taxonomy
Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
