Optimal Control of Storage Regeneration with Repair Codes
Francesco De Pellegrini, Rachid El Azouzi, Alonso Silva, and Olfa Hassani

TL;DR
This paper develops an optimal control framework for storage regeneration using repair codes to ensure high availability of containerized applications, minimizing activation and transfer costs under fault conditions.
Contribution
It introduces a controlled fluid model and derives the optimal activation policy for repair, characterized by a threshold policy using Pontryagin's minimum principle.
Findings
Optimal activation policy is of threshold type.
The model guides system dimensioning and cost tradeoff analysis.
Feasibility conditions for repair are established.
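To make "threshold type" concrete, here is a minimal Python sketch of a bang-bang activation control that runs at full rate until a switching time and is off afterwards. Everything in it is an illustrative assumption rather than the paper's model: the dynamics, the direction of the switch (on first, then off), and the names `threshold_control`, `simulate`, `switch_time`, `u_max`, and `beta` are all hypothetical.

```python
def threshold_control(t, switch_time, u_max=1.0):
    """Threshold (bang-bang) control: full activation rate before
    switch_time, no activation afterwards. The on-then-off direction
    is an assumption made for illustration."""
    return u_max if t < switch_time else 0.0


def simulate(switch_time, deadline=10.0, u_max=1.0, beta=0.5, dt=0.001):
    """Forward-Euler integration of a toy fluid model (hypothetical):
        dx/dt = u(t)              activated storage capacity
        dy/dt = beta * x * (1-y)  repaired fraction of lost content
    Returns (x, y, activation_cost) at the deadline."""
    steps = int(round(deadline / dt))
    x = 0.0                # activated capacity, arbitrary units
    y = 0.0                # repaired fraction, in [0, 1]
    activation_cost = 0.0  # integral of u(t) over [0, deadline]
    for k in range(steps):
        t = k * dt
        u = threshold_control(t, switch_time, u_max)
        x += u * dt
        y += beta * x * (1.0 - y) * dt
        activation_cost += u * dt
    return x, y, activation_cost


if __name__ == "__main__":
    # Compare a few switching times: later switching repairs more
    # content by the deadline but pays a higher activation cost.
    for s in (0.5, 1.0, 2.0):
        x, y, cost = simulate(s)
        print(f"switch_time={s}: repaired={y:.3f}, activation_cost={cost:.3f}")
```

In this toy setting the whole policy is described by a single number, the switching time, which is what makes threshold policies convenient for dimensioning and for exploring the tradeoff between cost and repaired content.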
Abstract
High availability of containerized applications requires robust storage of the applications' state. Since basic replication techniques are extremely costly at scale, storage space requirements can be reduced by means of erasure or repair codes. In this paper we address storage regeneration using repair codes, a robust distributed storage technique that does not need to fully restore the whole state in case of failure: only the lost servers' content is replaced. To do so, new clean-slate storage units are made operational, at a cost for activating new storage servers and a cost for transferring repair data. Our goal is to guarantee maximal availability of containers' state files by a given deadline while accounting for the activation of servers and the communication cost. Upon a fault occurring at a subset of the storage servers, we aim to ensure that they are repaired by a given deadline. We…
