Data-Intensive Workload Consolidation on Hadoop Distributed File System

Reza Moraveji; Javid Taheri; MohammadReza HosseinyFarahabady; Nikzad Babaii Rizvandi; Albert Y. Zomaya

arXiv:1303.7270·cs.DC·November 15, 2016


TL;DR

This paper explores workload consolidation challenges in Hadoop, analyzing how last-level cache (LLC) contention relates to throughput degradation, and proposes a greedy algorithm that reaches near-optimal server utilization.

Contribution

The paper systematically investigates consolidation challenges in Hadoop, models consolidation as a bin packing problem, and introduces a greedy algorithm that yields near-optimal server utilization.

Findings

01 Greedy algorithm achieves near-optimal solutions.

02 Cache contention impacts throughput in consolidated workloads.

03 Modeling consolidation as bin packing is effective.
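
To make the bin packing framing concrete, here is a minimal sketch of first-fit decreasing, a standard greedy heuristic for bin packing. It is not the paper's algorithm, which this page does not describe. The function name `first_fit_decreasing`, the single scalar `capacity`, and the integer "demand" units are all assumptions made for illustration; the paper's actual constraint concerns throughput staying above a predefined utilization level.

```python
from typing import List


def first_fit_decreasing(demands: List[int], capacity: int) -> List[List[int]]:
    """Assign workloads to servers with the first-fit decreasing heuristic.

    `demands` holds one hypothetical resource demand per workload and
    `capacity` is the hypothetical budget of each identical server.
    Returns one list of demands per server that was opened.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")

    servers: List[List[int]] = []  # workloads placed on each server
    free: List[int] = []           # remaining capacity of each server

    # Greedy step: consider the largest workloads first.
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"demand {demand} exceeds capacity {capacity}")
        # Place the workload on the first server that still has room.
        for i, room in enumerate(free):
            if demand <= room:
                servers[i].append(demand)
                free[i] = room - demand
                break
        else:
            # No existing server fits, so open a new one.
            servers.append([demand])
            free.append(capacity - demand)
    return servers


if __name__ == "__main__":
    # Demands are expressed as integer percentages of one server.
    placement = first_fit_decreasing([50, 70, 30, 20, 40, 10], capacity=100)
    print(len(placement), placement)
```

Integer percentages are used here so that comparisons against the remaining capacity are exact. A real consolidation policy would also need to account for interference effects such as LLC contention, which a single additive capacity does not capture.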

Abstract

Workload consolidation, sharing physical resources among multiple workloads, is a promising technique to save cost and energy in cluster computing systems. This paper highlights a few challenges of workload consolidation for Hadoop, one of the current state-of-the-art data-intensive cluster computing systems. Through a systematic step-by-step procedure, we investigate challenges for efficient server consolidation in Hadoop environments. To this end, we first investigate the inter-relationship between last-level cache (LLC) contention and throughput degradation for consolidated workloads on a single physical server employing the Hadoop Distributed File System (HDFS). We then investigate the general case of consolidation on multiple physical servers so that their throughput never falls below a desired/predefined utilization level. We use our empirical results to model consolidation as a…
