Performance Issues of Heterogeneous Hadoop Clusters in Cloud Computing
B. Thirumala Rao, N. V. Sridevi, V. Krishna Reddy, L. S. S. Reddy

TL;DR
This paper investigates the performance challenges of Hadoop clusters with heterogeneous nodes in cloud computing environments and offers guidelines to mitigate these issues.
Contribution
It identifies key performance bottlenecks in heterogeneous Hadoop clusters and proposes strategies to improve efficiency and resource utilization.
Findings
Hadoop's performance degrades on heterogeneous clusters, where nodes differ in computing capacity, because the current implementation assumes homogeneous nodes.
Certain scheduling and resource management strategies can improve performance.
Guidelines help optimize Hadoop in diverse hardware environments.
Abstract
Most cloud applications today process large amounts of data to produce the desired results. The data volumes that cloud applications must process are growing much faster than computing power, and this growth demands new strategies for processing and analyzing information. Dealing with large data volumes requires two things: (1) inexpensive, reliable storage and (2) new tools for analyzing unstructured and structured data. Hadoop is a powerful open-source software platform that addresses both of these problems. The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Hadoop performs poorly in heterogeneous clusters, where the nodes have different computing capacities. In this paper we address the issues that affect the performance of Hadoop in heterogeneous clusters and provide some guidelines on how to overcome these bottlenecks.
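To make the homogeneity assumption concrete, the following is a minimal sketch, not taken from the paper, of why giving every node the same share of work can hurt when nodes differ in capacity. The node speeds, the total workload, and the `makespan` helper are all hypothetical values and names chosen for illustration; the capacity-proportional split is shown only as a point of comparison and is not claimed to be one of the paper's guidelines.

```python
def makespan(work_per_node, speeds):
    """Completion time of a parallel phase: the slowest node finishes last."""
    return max(work / speed for work, speed in zip(work_per_node, speeds))


# Hypothetical relative processing speeds (units of data per second).
speeds = [4.0, 2.0, 1.0, 1.0]
total_work = 800.0  # hypothetical total units of data to process

# Homogeneity assumption: every node receives the same amount of data.
equal_split = [total_work / len(speeds)] * len(speeds)

# Comparison only: data assigned in proportion to each node's speed.
proportional_split = [total_work * s / sum(speeds) for s in speeds]

equal_time = makespan(equal_split, speeds)
proportional_time = makespan(proportional_split, speeds)

print(f"equal split:        {equal_time:.1f} s")
print(f"proportional split: {proportional_time:.1f} s")
```

With these assumed numbers, the equal split is held back by the two slowest nodes, while the proportional split lets every node finish at the same time. Real clusters add further factors, such as data locality and network transfer, that this sketch ignores.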
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
