Distributed Storage Allocations
Derek Leong, Alexandros G. Dimakis, Tracey Ho

TL;DR
This paper studies how to allocate a fixed storage budget across the nodes of a distributed storage system so as to maximize the probability of successful data recovery. It shows that optimal allocations can have complex structure and identifies when coding improves reliability.
Contribution
The paper provides the first comprehensive analysis of optimal storage allocations under various storage and access models, including the conditions under which coding is advantageous.
Findings
Optimal allocations often have nonintuitive structures.
Symmetric allocations are optimal in certain cases.
Coding's benefit depends on specific storage and access models.
Abstract
We examine the problem of allocating a given total storage budget in a distributed storage system for maximum reliability. A source has a single data object that is to be coded and stored over a set of storage nodes; it is allowed to store any amount of coded data in each node, as long as the total amount of storage used does not exceed the given budget. A data collector subsequently attempts to recover the original data object by accessing only the data stored in a random subset of the nodes. By using an appropriate code, successful recovery can be achieved whenever the total amount of data accessed is at least the size of the original data object. The goal is to find an optimal storage allocation that maximizes the probability of successful recovery. This optimization problem is challenging in general because of its combinatorial nature, despite its simple formulation. We study…
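The recovery condition described in the abstract can be illustrated with a small brute-force sketch. It makes several assumptions that are not stated in the abstract: the data object has unit size, the data collector accesses a uniformly random subset of exactly `r` nodes, and the code is ideal, so recovery succeeds exactly when the accessed amounts sum to at least the object size. The function name `recovery_probability` and its parameters are hypothetical, chosen only for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def recovery_probability(allocation, r, data_size=1):
    """Probability that a uniformly random r-subset of nodes holds enough data.

    allocation: amount of coded data stored on each node (illustrative input).
    r: number of nodes accessed (assumed access model: fixed-size subset).
    data_size: size of the original data object (assumed normalized to 1).
    """
    n = len(allocation)
    if not 0 <= r <= n:
        raise ValueError("r must be between 0 and the number of nodes")
    # Enumerate every r-subset of nodes and count those whose total stored
    # data is at least the size of the original object.
    successes = sum(
        1 for subset in combinations(allocation, r) if sum(subset) >= data_size
    )
    return Fraction(successes, comb(n, r))


# Budget T = 2 over n = 5 nodes, collector reads r = 2 nodes.
spread = [Fraction(2, 5)] * 5              # spread evenly over all nodes
four_nodes = [Fraction(1, 2)] * 4 + [0]    # spread evenly over four nodes
concentrated = [1, 1, 0, 0, 0]             # two full (uncoded) copies

print(recovery_probability(spread, 2))        # 0
print(recovery_probability(four_nodes, 2))    # 3/5
print(recovery_probability(concentrated, 2))  # 7/10
```

In this toy instance, spreading the budget evenly over all nodes never allows recovery, while concentrating it on two nodes succeeds most often, which hints at why the best allocation is not obvious. Exhaustive enumeration is only practical for small `n`.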
