Management of Data Replication for PC Cluster-based Cloud Storage System

Julia Myint; Thinn Thu Naing

arXiv:1112.5917 · cs.DC · December 30, 2011

Open Access

TL;DR

This paper presents an enhanced data replication management scheme for a cost-effective PC cluster-based cloud storage system using HDFS, optimizing data distribution, balancing, and fault tolerance.

Contribution

It introduces a novel replication management scheme that improves data balancing and fault tolerance in inexpensive PC cluster cloud storage systems.

Findings

01. Storage balancing depends on disk space and node failure probability.

02. An optimized replica number improves data availability.

03. The system effectively balances load across commodity nodes.
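
To make the first two findings concrete, here is a minimal sketch of how a replica count and per-node placement weights could be derived. It is not the authors' scheme: it assumes node failures are independent with a known probability p, so that r replicas give availability 1 - p^r, and it assumes a hypothetical weight proportional to free disk space times the probability that the node is up. The function names `min_replicas` and `placement_weights` are illustrative only.

```python
def min_replicas(node_failure_prob: float, target_availability: float) -> int:
    """Smallest r such that 1 - p**r >= target.

    Assumption (not from the paper): nodes fail independently with the
    same probability p, so data is lost only if all r replicas fail.
    """
    if not 0.0 < node_failure_prob < 1.0:
        raise ValueError("node_failure_prob must be in (0, 1)")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be in (0, 1)")
    r = 1
    while 1.0 - node_failure_prob ** r < target_availability:
        r += 1
    return r


def placement_weights(nodes: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Normalized placement weight per node.

    nodes maps a node name to (free_disk_gb, failure_prob).
    Hypothetical weighting: free space scaled by the chance the node is up.
    """
    raw = {name: free * (1.0 - p) for name, (free, p) in nodes.items()}
    total = sum(raw.values())
    if total <= 0.0:
        raise ValueError("no usable capacity in the cluster")
    return {name: w / total for name, w in raw.items()}


if __name__ == "__main__":
    print(min_replicas(0.2, 0.99))
    print(placement_weights({"pc1": (100.0, 0.5), "pc2": (50.0, 0.0)}))
```

Under these assumptions, a less reliable node needs proportionally more free space to receive the same share of new blocks, which is one simple way to combine the two factors named in the findings.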

Abstract

Storage systems are essential building blocks for cloud computing infrastructures. Although high-performance storage servers are the ultimate solution for cloud storage, the implementation of an inexpensive storage system remains an open issue. To address this problem, an efficient cloud storage system is implemented with inexpensive commodity computer nodes organized into a PC cluster-based datacenter. Hadoop Distributed File System (HDFS) is an open-source cloud-based storage platform designed to be deployed on low-cost hardware. The PC Cluster-based Cloud Storage System is implemented with HDFS by enhancing its replication management scheme. Data objects are distributed and replicated across a cluster of commodity nodes located in the cloud. The system provides an optimum replica number as well as weighting and balancing among the storage server nodes. The experimental results show…

Taxonomy

Topics: Cloud Computing and Resource Management · Caching and Content Delivery · Advanced Data Storage Technologies