Coded Data Rebalancing for Decentralized Distributed Databases
K V Sushena Sree, Prasad Krishnan

TL;DR
This paper introduces coded data rebalancing schemes for decentralized distributed databases with random data placement. The schemes correct data skew and restore the replication factor after node additions or removals, while keeping the communication load of rebalancing low.
Contribution
It proposes asymptotically optimal rebalancing schemes for decentralized databases with random placement, addressing the data skew and the drop in replication factor caused by node additions or removals.
Findings
Proposed rebalancing schemes are asymptotically optimal.
Schemes effectively handle node additions and removals.
Transmitting coded symbols reduces the communication load during data rebalancing.
Abstract
The performance of replication-based distributed databases is affected by non-uniform storage across storage nodes (also called data skew) and by reduction in the replication factor during operation, particularly due to node additions or removals. Data rebalancing refers to the communication between the nodes involved in correcting this data skew while maintaining the replication factor. For carefully designed distributed databases, transmitting coded symbols during the rebalancing phase has recently been shown to reduce the communication load of rebalancing. In this work, we look at balanced distributed databases with random placement, in which each data segment is stored in a random subset of r nodes in the system, where r refers to the replication factor of the distributed database. We call these decentralized databases. For a natural class of such…
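As a rough illustration of the random placement described in the abstract, the sketch below stores each segment on a random subset of r nodes, then removes one node and counts how many segments fall below the replication factor. It is not the paper's rebalancing scheme; the function names (`random_placement`, `node_loads`, `remove_node`) and the parameter values are assumptions chosen for illustration only.

```python
import random
from collections import Counter

def random_placement(num_segments, num_nodes, r, seed=0):
    """Assign each segment to a uniformly random subset of r distinct nodes."""
    rng = random.Random(seed)
    return [set(rng.sample(range(num_nodes), r)) for _ in range(num_segments)]

def node_loads(placement, num_nodes):
    """Number of segments stored at each node (uneven loads indicate data skew)."""
    counts = Counter(node for holders in placement for node in holders)
    return [counts[n] for n in range(num_nodes)]

def remove_node(placement, node):
    """Drop one node; every segment it held loses one replica."""
    return [holders - {node} for holders in placement]

# Hypothetical parameters: 1000 segments, 10 nodes, replication factor 3.
NUM_SEGMENTS, NUM_NODES, R = 1000, 10, 3
placement = random_placement(NUM_SEGMENTS, NUM_NODES, R)
loads = node_loads(placement, NUM_NODES)

after_removal = remove_node(placement, 0)
under_replicated = sum(1 for holders in after_removal if len(holders) < R)

print("per-node loads:", loads)
print("segments below replication factor after removing node 0:", under_replicated)
```

Every segment that was stored on the removed node is left with r - 1 replicas, so the number of under-replicated segments equals that node's former load. Rebalancing has to restore those replicas and even out the loads on the surviving nodes.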
