Performance Tuning of Hadoop MapReduce: A Noisy Gradient Approach

Sandeep Kumar; Sindhu Padakandla; Chandrashekar L; Priyank Parihar; K. Gopinath; Shalabh Bhatnagar

arXiv:1611.10052·cs.DC·August 28, 2019

TL;DR

This paper introduces a parameter tuning method for Hadoop MapReduce based on a noisy gradient approach. The method is automatic and dimensionality-free, and it reduces job execution times.

Contribution

It presents a tuning methodology based on simultaneous perturbation stochastic approximation (SPSA) that handles cross-parameter interactions and large search spaces in Hadoop configurations.

Findings

01. Achieved 66% average reduction in Hadoop job execution time.

02. Reduced execution times by 45% compared to previous tuning methods.

03. Validated effectiveness on multiple Hadoop benchmarks.

Abstract

Hadoop MapReduce is a framework for distributed storage and processing of large datasets that is popular in big data analytics. It has various configuration parameters (knobs) that play an important role in determining the performance, i.e., the execution time, of a given big data processing job. Default values of these parameters do not always result in good performance, and hence it is important to tune them. However, tuning is inherently difficult for two reasons: first, the parameter search space is large, and second, there are cross-parameter interactions. Hence, there is a need for a dimensionality-free method that can automatically tune the configuration parameters while taking the cross-parameter dependencies into account. In this paper, we propose a novel Hadoop parameter tuning methodology, based on a noisy gradient algorithm known as the…
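To make the noisy gradient idea concrete, the following is a minimal Python sketch of a standard two-measurement SPSA loop. It is not the authors' implementation and may differ from the variant used in the paper. The property names are real Hadoop settings, but their ranges, the synthetic cost function `true_time`, the noise level, and the gain constants are all assumptions chosen for illustration. The point it shows is that each iteration needs only two job runs, whatever the number of parameters, because all parameters are perturbed at once.

```python
import random

# Real Hadoop property names; the (low, high) ranges are illustrative assumptions.
PARAM_RANGES = {
    "mapreduce.task.io.sort.mb": (50, 1000),
    "mapreduce.map.sort.spill.percent": (0.5, 0.95),
    "mapreduce.job.reduces": (1, 100),
}


def clip(x):
    """Keep a normalized parameter inside [0, 1]."""
    return min(1.0, max(0.0, x))


def to_hadoop_config(theta):
    """Map normalized values in [0, 1] onto the assumed parameter ranges."""
    config = {}
    for t, (name, (low, high)) in zip(theta, PARAM_RANGES.items()):
        value = low + t * (high - low)
        config[name] = round(value) if isinstance(low, int) else round(value, 2)
    return config


def true_time(theta):
    """Hypothetical noise-free execution time (seconds), with an x*y interaction term."""
    x, y, z = theta
    return (100 + 400 * (x - 0.7) ** 2 + 400 * (y - 0.3) ** 2
            + 300 * (x - 0.7) * (y - 0.3) + 200 * (z - 0.6) ** 2)


def make_noisy_job(seed=1, noise_sd=2.0):
    """Stand-in for running a real job: returns the true time plus Gaussian noise."""
    rng = random.Random(seed)
    return lambda theta: true_time(theta) + rng.gauss(0.0, noise_sd)


def spsa_minimize(cost, theta0, iterations=300, a=0.001, c=0.1, A=10, seed=0):
    """Minimize a noisy cost over [0, 1]^n using two cost evaluations per iteration."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(1, iterations + 1):
        a_k = a / (k + A) ** 0.602   # decaying step size
        c_k = c / k ** 0.101         # decaying perturbation size
        # Perturb every coordinate simultaneously with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [clip(t + c_k * d) for t, d in zip(theta, delta)]
        minus = [clip(t - c_k * d) for t, d in zip(theta, delta)]
        diff = cost(plus) - cost(minus)
        # Per-coordinate gradient estimate: diff / (2 * c_k * delta_i).
        theta = [clip(t - a_k * diff / (2.0 * c_k * d))
                 for t, d in zip(theta, delta)]
    return theta


start = [0.5, 0.5, 0.5]
tuned = spsa_minimize(make_noisy_job(), start)
print("start config:", to_hadoop_config(start), "time:", true_time(start))
print("tuned config:", to_hadoop_config(tuned), "time:", true_time(tuned))
```

In a real setting, `make_noisy_job` would be replaced by a function that writes the configuration, submits the job to the cluster, and returns the measured execution time. The exponents 0.602 and 0.101 are commonly recommended SPSA gain decay rates, while `a`, `c`, and `A` would need tuning for any actual workload.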
