Harnessing Correlations in Distributed Erasure Coded Key-Value Stores
Ramy E. Ali, Viveck Cadambe

TL;DR
This paper investigates how exploiting correlations between data versions in distributed erasure coded key-value stores can reduce storage costs while maintaining consistency, using novel coding schemes and information-theoretic bounds.
Contribution
It introduces new multi-version coding schemes that leverage data correlations to lower storage costs, and proves their near-optimality in certain regimes.
Findings
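When data versions are correlated, the storage cost of ensuring consistency can be reduced.
The proposed codes can need significantly less storage than earlier constructions that treat versions as independent, depending on the degree of correlation.
Information-theoretic bounds show that the proposed codes are near-optimal in certain regimes.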
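As a rough illustration of why correlation can lower storage cost, the sketch below compares the bits needed to store two versions as if they were independent against the bits needed when the second version is described relative to the first. It is a toy single-node calculation, not the paper's multi-version code construction. The correlation model (version 1 is uniformly random, and each bit of version 2 is flipped independently with probability `flip_prob`) and all function names are assumptions made for this example.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bits_independent(n_bits: int) -> float:
    """Cost of storing two n-bit versions as if they were unrelated."""
    return 2.0 * n_bits


def bits_joint(n_bits: int, flip_prob: float) -> float:
    """Joint entropy H(W1, W2) = H(W1) + H(W2 | W1) under the assumed model:
    W1 is uniform over n bits, and W2 flips each bit of W1 independently
    with probability flip_prob."""
    return n_bits * (1.0 + binary_entropy(flip_prob))


if __name__ == "__main__":
    n = 1_000_000
    for p in (0.5, 0.1, 0.01):
        print(f"flip_prob={p}: independent={bits_independent(n):.0f} bits, "
              f"joint={bits_joint(n, p):.0f} bits")
```

With `flip_prob = 0.5` the two versions are independent and the two costs coincide. As `flip_prob` shrinks, the joint cost approaches that of a single version. The distributed setting studied in the paper is harder, because servers may hold different subsets of versions.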
Abstract
Motivated by applications of distributed storage systems to cloud-based key-value stores, the multi-version coding problem has been recently formulated to efficiently store frequently updated data in asynchronous decentralized storage systems. Inspired by consistency requirements in distributed systems, the main goal in multi-version coding is to ensure that the latest possible version of the data is decodable, even if the data updates have not reached some servers in the system. In this paper, we study the storage cost of ensuring consistency for the case where the data versions are correlated, in contrast to previous work where data versions were treated as being independent. We provide multi-version code constructions that show that the storage cost can be significantly smaller than the previous constructions depending on the degree of correlation between the different versions of…
