Self-Repairing Disk Arrays
Jehan-François Pâris, Ahmed Amer, Darrell D. E. Long, Thomas J. E. Schwarz

TL;DR
This paper proposes self-repairing disk arrays with sufficient spare disks to eliminate human intervention, demonstrating high data durability through simulation under realistic failure models.
Contribution
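It introduces a design for self-repairing two-dimensional disk arrays that carry enough spare disks to run their whole lifetime without manual repairs, and evaluates it by simulation: for an array with n parity disks and n(n-1)/2 data disks, n(n+1)/2 spare disks suffice for high data reliability.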
Findings
Achieves 99.999% data survival probability over four years
n(n+1)/2 spare disks are more than enough to meet this reliability goal
RAID level 6 organizations cannot reach the same objectives; doing so would require RAID stripes that tolerate triple disk failures
Abstract
As the prices of magnetic storage continue to decrease, the cost of replacing failed disks becomes increasingly dominated by the cost of the service call itself. We propose to eliminate these calls by building disk arrays that contain enough spare disks to operate without any human intervention during their whole lifetime. To evaluate the feasibility of this approach, we have simulated the behavior of two-dimensional disk arrays with n parity disks and n(n-1)/2 data disks under realistic failure and repair assumptions. Our conclusion is that having n(n+1)/2 spare disks is more than enough to achieve a 99.999 percent probability of not losing data over four years. We observe that the same objectives cannot be reached with RAID level 6 organizations and would require RAID stripes that could tolerate triple disk failures.
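The disk counts in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the pairwise layout (one data disk for each unordered pair of parity disks) is an assumption chosen because it is consistent with the n(n-1)/2 data-disk count, not a layout stated on this page. Note that n + n(n-1)/2 = n(n+1)/2, so the proposed spare pool is the same size as the active array.

```python
from itertools import combinations


def array_sizes(n: int) -> dict:
    """Disk counts for an array with n parity disks, per the abstract."""
    if n < 2:
        raise ValueError("n must be at least 2")
    parity = n
    data = n * (n - 1) // 2      # data disks, from the abstract
    spares = n * (n + 1) // 2    # spare disks, from the abstract
    active = parity + data       # equals n(n+1)/2, the same as spares
    return {
        "parity": parity,
        "data": data,
        "active": active,
        "spares": spares,
        "total": active + spares,
    }


def pairwise_layout(n: int) -> dict:
    """ASSUMPTION (not from this page): each data disk is shared by
    exactly one unordered pair of parity disks, giving n(n-1)/2 disks."""
    pairs = combinations(range(n), 2)
    return {f"D{i}": pair for i, pair in enumerate(pairs)}


if __name__ == "__main__":
    print(array_sizes(4))
    print(pairwise_layout(4))
```

For n = 4 this gives 4 parity disks, 6 data disks, and 10 spares, for 20 disks in total.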
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed systems and fault tolerance · Caching and Content Delivery
