Management of Data Replication for PC Cluster-based Cloud Storage System
Julia Myint, Thinn Thu Naing

TL;DR
This paper presents an enhanced data replication management scheme for a cost-effective PC cluster-based cloud storage system using HDFS, optimizing data distribution, balancing, and fault tolerance.
Contribution
It introduces a novel replication management scheme that improves data balancing and fault tolerance in inexpensive PC cluster cloud storage systems.
Findings
Storage balancing depends on disk space and node failure probability.
An optimized replica count improves data availability.
The system balances load effectively across commodity nodes.
Abstract
Storage systems are essential building blocks for cloud computing infrastructures. Although high-performance storage servers are the ultimate solution for cloud storage, implementing an inexpensive storage system remains an open issue. To address this problem, an efficient cloud storage system is implemented with inexpensive commodity computer nodes organized into a PC cluster-based datacenter. Hadoop Distributed File System (HDFS) is an open-source, cloud-based storage platform designed to be deployed on low-cost hardware. The PC Cluster-based Cloud Storage System is implemented with HDFS by enhancing its replication management scheme. Data objects are distributed and replicated across a cluster of commodity nodes located in the cloud. The system provides an optimum replica number as well as weighting and balancing among the storage server nodes. The experimental results show…
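The abstract mentions an optimum replica number and weighting among storage nodes, but the paper's actual formulas are not reproduced here. The sketch below is only an illustration under stated assumptions: node failures are independent with a known per-node failure probability, an object is available if at least one replica survives, and a node's placement weight is its free disk space scaled by its survival probability. The function names `min_replicas` and `placement_weights` are hypothetical and are not part of HDFS or of the paper.

```python
def min_replicas(node_failure_prob: float, target_availability: float) -> int:
    """Smallest replica count r such that 1 - p**r >= target.

    Assumption (not from the paper): nodes fail independently with the
    same probability p, and one surviving replica is enough.
    """
    if not 0.0 < node_failure_prob < 1.0:
        raise ValueError("node_failure_prob must be in (0, 1)")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be in (0, 1)")
    replicas = 1
    # Add replicas until the chance that all of them fail is small enough.
    while 1.0 - node_failure_prob ** replicas < target_availability:
        replicas += 1
    return replicas


def placement_weights(nodes: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Normalized placement weight per node.

    `nodes` maps a node name to (free_disk_gb, failure_prob).
    Assumption (not from the paper): weight = free space * (1 - failure_prob).
    """
    raw = {name: free * (1.0 - p) for name, (free, p) in nodes.items()}
    total = sum(raw.values())
    if total <= 0.0:
        raise ValueError("no usable capacity in the cluster")
    return {name: w / total for name, w in raw.items()}


if __name__ == "__main__":
    # Hypothetical cluster: values chosen only for illustration.
    print(min_replicas(0.1, 0.995))
    print(placement_weights({"node-a": (100.0, 0.1), "node-b": (50.0, 0.2)}))
```

Under these assumptions, a node with more free space and a lower failure probability receives a larger share of new replicas, which matches the finding that storage balancing depends on disk space and node failure probability.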
Taxonomy
Topics: Cloud Computing and Resource Management · Caching and Content Delivery · Advanced Data Storage Technologies
