Self-Repairing Disk Arrays

Jehan-François Pâris; Ahmed Amer; Darrell D. E. Long; Thomas J. E. Schwarz

arXiv:1501.00513·cs.DC·January 6, 2015

TL;DR

This paper proposes self-repairing disk arrays with sufficient spare disks to eliminate human intervention, demonstrating high data durability through simulation under realistic failure models.

Contribution

It introduces a design for self-repairing two-dimensional disk arrays that carry n(n+1)/2 spare disks alongside n parity disks and n(n-1)/2 data disks, and evaluates it by simulation to show that high data reliability is achievable without manual repairs.

Findings

01

Achieves 99.999% data survival probability over four years

02

n(n+1)/2 spare disks are more than enough to meet this reliability goal

03

RAID level 6 organizations cannot meet the same objective; doing so would require RAID stripes that tolerate triple disk failures

Abstract

As the prices of magnetic storage continue to decrease, the cost of replacing failed disks becomes increasingly dominated by the cost of the service call itself. We propose to eliminate these calls by building disk arrays that contain enough spare disks to operate without any human intervention during their whole lifetime. To evaluate the feasibility of this approach, we have simulated the behavior of two-dimensional disk arrays with n parity disks and n(n-1)/2 data disks under realistic failure and repair assumptions. Our conclusion is that having n(n+1)/2 spare disks is more than enough to achieve a 99.999 percent probability of not losing data over four years. We observe that the same objectives cannot be reached with RAID level 6 organizations and would require RAID stripes that could tolerate triple disk failures.
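
The disk counts quoted in the abstract can be worked out for a concrete n. The sketch below is only an illustration of that arithmetic, not the authors' simulator; the names `ArrayLayout` and `layout_for` are hypothetical. It also makes one property visible: because n + n(n-1)/2 = n(n+1)/2, the proposed spare pool is exactly as large as the set of active (parity plus data) disks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayLayout:
    """Disk counts for one array size n (hypothetical helper type)."""
    n: int
    parity_disks: int
    data_disks: int
    spare_disks: int

    @property
    def active_disks(self) -> int:
        # Disks holding data or parity at any one time.
        return self.parity_disks + self.data_disks

    @property
    def total_disks(self) -> int:
        # Everything installed up front, spares included.
        return self.active_disks + self.spare_disks


def layout_for(n: int) -> ArrayLayout:
    """Apply the formulas from the abstract to a given n."""
    if n < 2:
        raise ValueError("n must be at least 2 to have any data disks")
    return ArrayLayout(
        n=n,
        parity_disks=n,                 # n parity disks
        data_disks=n * (n - 1) // 2,    # n(n-1)/2 data disks
        spare_disks=n * (n + 1) // 2,   # n(n+1)/2 spare disks
    )


if __name__ == "__main__":
    for n in (4, 7, 10):
        lay = layout_for(n)
        print(n, lay.parity_disks, lay.data_disks, lay.spare_disks, lay.total_disks)
```

For example, n = 10 gives 10 parity disks, 45 data disks, and 55 spares, for 110 disks installed in total.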

Taxonomy

Topics: Advanced Data Storage Technologies · Distributed systems and fault tolerance · Caching and Content Delivery