Rack-Aware Regenerating Codes with Multiple Erasure Tolerance
Liyang Zhou, Zhifang Zhang

TL;DR
This paper introduces a relaxed repair model for rack-aware regenerating codes that tolerates multiple node failures by reducing the number of intra-rack and cross-rack connections needed for repair, and gives explicit code constructions with systematic encoding for practical use.
Contribution
The paper proposes a relaxed repair model for rack-aware codes that can handle multiple node failures, and provides explicit code constructions with systematic encoding.
Findings
Derived a tradeoff between storage and repair bandwidth under the relaxed model.
Characterized parameters at the extreme points for minimum storage and minimum bandwidth.
Constructed explicit codes with low sub-packetization and practical systematic encoding.
Abstract
In a modern distributed storage system, storage nodes are organized in racks, and cross-rack communication dominates the system bandwidth. We study the rack-aware storage system, where all storage nodes are organized in racks and the nodes within each rack can communicate freely without taxing the system bandwidth. Rack-aware regenerating codes (RRCs) were proposed to minimize the repair bandwidth for single erasures. In the initial setting of RRCs, repairing a single node requires the participation of all the remaining nodes in the rack containing the failed node, as well as a large number of helper racks containing no failures. Consequently, the repair may be infeasible in the presence of multiple node failures. In this work, a relaxed repair model that can tolerate multiple node failures by simultaneously reducing the intra-rack connections and cross-rack connections is…
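To illustrate why the initial RRC setting can break down under multiple failures, here is a minimal sketch of the two participation requirements the abstract describes. It is not the paper's model or notation: the function name, the `(rack, node)` indexing, and the parameter `helper_racks_needed` are hypothetical choices made for this example only.

```python
def original_rrc_repair_feasible(rack_sizes, failed, target, helper_racks_needed):
    """Check the participation requirements of the initial RRC setting.

    rack_sizes: list with the number of nodes in each rack (hypothetical layout).
    failed: set of (rack, node) pairs that are currently down.
    target: the (rack, node) pair to repair; must be in `failed`.
    helper_racks_needed: number of failure-free helper racks required
        (an assumed parameter, not taken from the paper).
    """
    target_rack, _ = target
    racks_with_failures = {rack for rack, _ in failed}

    # Requirement 1: all remaining nodes in the target's rack must participate,
    # so none of them may have failed.
    others_alive = all(
        (target_rack, node) == target or (target_rack, node) not in failed
        for node in range(rack_sizes[target_rack])
    )

    # Requirement 2: enough other racks must contain no failures at all.
    clean_racks = [
        rack
        for rack in range(len(rack_sizes))
        if rack != target_rack and rack not in racks_with_failures
    ]
    return others_alive and len(clean_racks) >= helper_racks_needed


rack_sizes = [3, 3, 3, 3]
single = original_rrc_repair_feasible(rack_sizes, {(0, 0)}, (0, 0), 3)
cross_rack = original_rrc_repair_feasible(rack_sizes, {(0, 0), (1, 2)}, (0, 0), 3)
same_rack = original_rrc_repair_feasible(rack_sizes, {(0, 0), (0, 1)}, (0, 0), 3)
```

With four racks of three nodes and three failure-free helper racks required, a single failure satisfies both requirements, while a second failure in either the same rack or another rack violates one of them. Relaxing how many intra-rack and cross-rack connections a repair needs is what lets the relaxed model tolerate such patterns.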
