On Modelling and Prediction of Total CPU Usage for Applications in MapReduce Environments

Nikzad Babaii Rizvandi; Javid Taheri; Reza Moraveji; Albert Y. Zomaya

arXiv:1203.4054·cs.DC·July 30, 2012·1 cites

Open Access

TL;DR

This paper presents a polynomial regression-based model to predict total CPU usage of MapReduce jobs, aiding resource provisioning and configuration parameter selection in cloud environments.

Contribution

It introduces a novel approach to model and predict total CPU usage based on configuration parameters and input data scaling in MapReduce environments.

Findings

01. Prediction accuracy within 8% of actual CPU usage

02. Model validated on three real-world applications

03. Input data scaling influences total CPU usage

Abstract

Recently, businesses have started using MapReduce as a popular computation framework for processing large amounts of data, in tasks such as spam detection and various data mining jobs, in both public and private clouds. Two of the challenging questions in such environments are (1) choosing suitable values for MapReduce configuration parameters (e.g., number of mappers, number of reducers, and DFS block size), and (2) predicting the amount of resources that a user should lease from the service provider. Currently, both choosing configuration parameters and estimating required resources are solely the users' responsibility. In this paper, we present an approach to provision the total CPU usage, in clock cycles, of jobs in a MapReduce environment. For a MapReduce job, a profile of total CPU usage in clock cycles is built from the job's past executions with different values of two…
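To make the profiling idea concrete, here is a minimal sketch of fitting a polynomial regression to past executions and using it to predict total CPU usage for an untried configuration. It is not the authors' implementation. The abstract is cut off before it names the two parameters, so the sketch assumes, purely for illustration, that they are the number of mappers and the number of reducers. The polynomial degree, the function names (`design_matrix`, `fit_profile`, `predict`, `synthetic_cpu`), and the training data are likewise hypothetical; the data are generated from a made-up formula, not measured.

```python
import numpy as np


def design_matrix(mappers, reducers, degree=2):
    """Build the polynomial terms m^i * r^j for all i + j <= degree."""
    m = np.asarray(mappers, dtype=float)
    r = np.asarray(reducers, dtype=float)
    cols = [m**i * r**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)


def fit_profile(mappers, reducers, cpu_cycles, degree=2):
    """Least-squares fit of polynomial coefficients to past executions."""
    X = design_matrix(mappers, reducers, degree)
    y = np.asarray(cpu_cycles, dtype=float)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs


def predict(coeffs, mappers, reducers, degree=2):
    """Predict total CPU usage for new (mappers, reducers) settings."""
    return design_matrix(mappers, reducers, degree) @ coeffs


def synthetic_cpu(m, r):
    """Made-up stand-in for measured CPU usage (arbitrary units)."""
    return 5.0 + 0.8 * m + 1.5 * r + 0.02 * m * m + 0.05 * r * r + 0.01 * m * r


# Hypothetical "past executions": a grid of configurations (assumption).
grid = [(m, r) for m in (4, 8, 12, 16, 20) for r in (2, 4, 6, 8)]
train_m = np.array([m for m, _ in grid], dtype=float)
train_r = np.array([r for _, r in grid], dtype=float)
train_cpu = synthetic_cpu(train_m, train_r)

coeffs = fit_profile(train_m, train_r, train_cpu)

# Predict for a configuration that was not in the training grid.
predicted = predict(coeffs, [10], [5])[0]
actual = synthetic_cpu(10.0, 5.0)
relative_error = abs(predicted - actual) / actual
print(f"predicted={predicted:.3f} actual={actual:.3f} "
      f"relative error={relative_error:.2e}")
```

Because the synthetic data are themselves a degree-2 polynomial with no noise, the fit here is essentially exact. With real measurements the error would depend on measurement noise, the chosen degree, and how well the sampled configurations cover the range of interest.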

Taxonomy

Topics: Cloud Computing and Resource Management · Data Mining Algorithms and Applications · Big Data and Business Intelligence