Capacity of Clustered Distributed Storage
Jy-yong Sohn, Beongjun Choi, Sung Whan Yoon, Jaekyun Moon

TL;DR
This paper models the capacity of clustered distributed storage systems, analyzing how intra- and cross-cluster repair bandwidths affect storage efficiency and proposing intra-cluster repairable codes to minimize cross-cluster traffic.
Contribution
It introduces a new clustered storage model, derives its capacity as a function of key resources, and proposes intra-cluster repairable codes to reduce cross-cluster bandwidth usage.
Findings
As the system size grows, capacity decreases as the number of clusters increases.
Feasible resource sets for reliable storage are characterized in closed form.
Zero cross-cluster traffic is achievable at the cost of extra per-node storage and intra-cluster repair bandwidth, using intra-cluster repairable codes.
Abstract
A new system model reflecting the clustered structure of distributed storage is suggested to investigate the interplay between storage overhead and repair bandwidth as storage node failures occur. Large data centers with multiple racks/disks or local networks of storage devices (e.g., sensor networks) are good applications of the suggested clustered model. In realistic scenarios involving clustered storage structures, repairing storage nodes using intact nodes residing in other clusters is more bandwidth-consuming than restoring nodes based on information from intra-cluster nodes. Therefore, it is important to differentiate between intra-cluster repair bandwidth and cross-cluster repair bandwidth in modeling distributed storage. The capacity of the suggested model is obtained as a function of fundamental resources of distributed storage systems, namely, node storage capacity, intra-cluster repair…
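The distinction the abstract draws between intra-cluster and cross-cluster repair bandwidth can be illustrated with a small bookkeeping sketch. This is not the paper's capacity expression. It only tallies repair traffic under assumptions that are not stated in the text above: equal-sized clusters, every surviving node acting as a helper, and a fixed download per helper. The names `ClusteredRepairModel`, `beta_intra`, and `beta_cross` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteredRepairModel:
    """Illustrative repair-traffic bookkeeping (assumed model, not the paper's formula)."""

    n: int              # total number of storage nodes
    L: int              # number of clusters (assumed equal-sized)
    beta_intra: float   # data downloaded from each helper in the same cluster
    beta_cross: float   # data downloaded from each helper in another cluster

    def __post_init__(self) -> None:
        if self.L <= 0 or self.n <= 0 or self.n % self.L != 0:
            raise ValueError("n must be a positive multiple of L")

    @property
    def nodes_per_cluster(self) -> int:
        return self.n // self.L

    def intra_cluster_bandwidth(self) -> float:
        # Assumption: all other nodes in the failed node's cluster help.
        return (self.nodes_per_cluster - 1) * self.beta_intra

    def cross_cluster_bandwidth(self) -> float:
        # Assumption: every node outside the failed node's cluster helps.
        return (self.n - self.nodes_per_cluster) * self.beta_cross


if __name__ == "__main__":
    model = ClusteredRepairModel(n=12, L=3, beta_intra=2.0, beta_cross=0.5)
    print("intra-cluster repair traffic:", model.intra_cluster_bandwidth())
    print("cross-cluster repair traffic:", model.cross_cluster_bandwidth())
```

Setting `beta_cross=0.0` in this sketch corresponds to the purely local repair regime, where a failed node is rebuilt from its own cluster only.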
