The Storage vs Repair Bandwidth Trade-off for Multiple Failures in Clustered Storage Networks

Vitaly Abdrashitov; N. Prakash; Muriel Médard

arXiv:1708.05474·cs.IT·August 21, 2017

TL;DR

This paper studies the trade-off between storage overhead and inter-cluster repair bandwidth in clustered storage systems when multiple nodes within a cluster fail. It characterizes the optimal trade-off under functional repair and, at the minimum-storage and MBR operating points, under exact repair.

Contribution

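It characterizes the optimal trade-off between storage and inter-cluster repair bandwidth for jointly repairing $t \geq 2$ failed nodes within a cluster: fully under functional repair, and at the minimum-storage and MBR operating points under exact repair. The bounds depend on whether $t$ divides $m - ℓ$, where $m$ is the number of nodes per cluster and $ℓ$ is the number of local helper nodes.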

Findings

01

Trade-off is the same as for a single failure when $t ∣ (m - ℓ)$

02

When $t ∤ (m - ℓ)$, exact repair at the MBR point can have a smaller file size than functional repair

03

More local helpers do not always increase capacity under functional repair

Abstract

We study the trade-off between storage overhead and inter-cluster repair bandwidth in clustered storage systems, while recovering from multiple node failures within a cluster. A cluster is a collection of $m$ nodes, and there are $n$ clusters. For data collection, we download the entire content from any $k$ clusters. For repair of $t \geq 2$ nodes within a cluster, we take help from $ℓ$ local nodes, as well as $d$ helper clusters. We characterize the optimal trade-off under functional repair, and also under exact repair for the minimum storage and minimum inter-cluster bandwidth (MBR) operating points. Our bounds show the following interesting facts: (1) When $t ∣ (m - ℓ)$, the trade-off is the same as that under $t = 1$, and thus there is no advantage in jointly repairing multiple nodes; (2) When $t ∤ (m - ℓ)$, the optimal file-size at the MBR point under exact repair can be…
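As a small illustration of the divisibility condition in the abstract, the following Python sketch records the system parameters $(n, m, k, d, ℓ, t)$ and reports whether $t ∣ (m - ℓ)$, the case in which the abstract states the trade-off matches that of $t = 1$. The class name `ClusterParams`, the function name `joint_repair_matches_single`, and the sanity check $ℓ \leq m - t$ are assumptions introduced here for illustration and are not taken from the paper. The sketch does not compute any of the paper's bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterParams:
    """Hypothetical container for the parameters named in the abstract."""
    n: int    # number of clusters
    m: int    # nodes per cluster
    k: int    # clusters contacted for data collection
    d: int    # helper clusters used during repair
    ell: int  # local helper nodes used during repair
    t: int    # failed nodes repaired together within one cluster


def joint_repair_matches_single(p: ClusterParams) -> bool:
    """Return True when t divides (m - ell).

    Per the abstract, this is the case where the trade-off is the same
    as under t = 1. The range check below is an assumption of this
    sketch: failed nodes cannot serve as local helpers.
    """
    if p.t < 1:
        raise ValueError("t must be at least 1")
    if not 0 <= p.ell <= p.m - p.t:
        raise ValueError("assumed 0 <= ell <= m - t")
    return (p.m - p.ell) % p.t == 0
```

For example, with $m = 6$, $ℓ = 2$, $t = 2$ the function returns `True` because $2 ∣ 4$, while with $m = 6$, $ℓ = 1$, $t = 2$ it returns `False` because $2 ∤ 5$.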
