Characterization and Architectural Implications of Big Data Workloads
Lei Wang, Jianfeng Zhan, Zhen Jia, Rui Han

TL;DR
This paper characterizes big data workloads, revealing their data movement dominance, high front-end stalls, and the impact of software stacks on processor efficiency, based on a reduced set of representative workloads.
Contribution
It reduces 77 big data workloads to 17 representative ones and provides a detailed micro-architectural comparison with traditional workloads.
Findings
Big data workloads are dominated by data movement and contain a high proportion of branch operations.
Hadoop and Spark workloads exhibit higher front-end stalls.
Software stack complexity significantly affects processor efficiency.
Abstract
The big data field is expanding rapidly in terms of a growing number of workloads and runtime systems, which poses a serious challenge to workload characterization, the foundation of innovative system and architecture design. Previous major efforts on big data benchmarking either propose a comprehensive but very large set of workloads, or select only a few workloads according to so-called popularity, which may lead to partial or even biased observations. In this paper, on the basis of a comprehensive big data benchmark suite, BigDataBench, we reduce 77 workloads to 17 representative workloads from a micro-architectural perspective. On a typical state-of-practice platform, the Intel Xeon E5645, we compare the representative big data workloads with SPECINT, SPECFP, PARSEC, CloudSuite and HPCC. After a comprehensive workload characterization, we have the following…
Taxonomy
Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
