Modeling and Optimization of Latency in Erasure-coded Storage Systems
Vaneet Aggarwal, Tian Lan

TL;DR
This paper reviews recent theoretical and practical advances in modeling and optimizing access latency in erasure-coded distributed storage systems, addressing challenges in cloud and edge environments.
Contribution
It provides a comprehensive overview of latency modeling approaches, including scheduling policies and real-world implementation lessons, highlighting key challenges and open problems.
Findings
Analysis of scheduling policies like MDS-Reservation and Fork-Join.
Characterization of latency metrics such as mean and tail latency.
Insights from prototype implementations and real-world applications.
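As a rough illustration of the mean and tail latency metrics listed above, the following sketch estimates both for a single request that is forked to n storage nodes and completes once any k chunk reads return. It is a minimal sketch under stated assumptions, not the monograph's model: chunk read times are assumed i.i.d. exponential, queueing is ignored entirely (queueing is precisely what the Fork-Join and MDS-Reservation analyses address), and the function names and parameters are hypothetical.

```python
import random


def kth_fastest_latency(n, k, rate, rng):
    """Time until k of n parallel chunk reads finish (assumed Exp(rate) reads)."""
    times = sorted(rng.expovariate(rate) for _ in range(n))
    return times[k - 1]


def latency_stats(n, k, rate=1.0, trials=20000, seed=0):
    """Return (mean, 99th-percentile) latency estimated by Monte Carlo.

    Assumption: no queueing; every request finds all n nodes idle.
    """
    rng = random.Random(seed)
    samples = sorted(kth_fastest_latency(n, k, rate, rng) for _ in range(trials))
    mean = sum(samples) / trials
    p99 = samples[min(trials - 1, int(0.99 * trials))]
    return mean, p99


if __name__ == "__main__":
    # Compare reading from exactly k nodes with reading from n > k nodes.
    for n, k in [(4, 4), (7, 4)]:
        mean, p99 = latency_stats(n, k)
        print(f"(n={n}, k={k}): mean={mean:.3f}, p99={p99:.3f}")
```

Under the exponential assumption, the exact mean is (1/rate) · (1/(n-k+1) + … + 1/n), so the estimate can be checked against a closed form. Issuing the request to more than k nodes lowers both metrics in this idealized setting; in a real system the extra reads also add load, which is the trade-off that scheduling policies have to manage.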
Abstract
As consumers increasingly engage in social networking and e-commerce, businesses come to rely on Big Data analytics for intelligence, and traditional IT infrastructures continue to migrate to the cloud and edge, demand for distributed data storage is rising at an unprecedented speed. Erasure coding has quickly emerged as a promising technique for reducing storage cost while providing reliability similar to that of replicated systems, and it has been widely adopted by companies like Facebook, Microsoft and Google. However, it also brings new challenges in characterizing and optimizing access latency when erasure codes are used in distributed storage. The aim of this monograph is to provide a review of recent progress (both theoretical and practical) on systems that employ erasure codes for distributed storage. In this monograph, we will first identify the key…
Taxonomy
Topics: Caching and Content Delivery · Advanced Data Storage Technologies · Privacy-Preserving Technologies in Data
