Improved Bounds and Schemes for the Declustering Problem
Benjamin Doerr, Nils Hebbinghaus, Sören Werth

TL;DR
This paper presents improved bounds and new declustering schemes for distributing data across storage devices, utilizing discrepancy theory to achieve near-optimal evenness in data allocation for higher-dimensional data.
Contribution
The paper introduces a declustering scheme whose additive error is independent of the data size. The scheme works for any number of storage devices in dimensions two and three, and for device counts that are powers of two in general dimension.
Findings
Achieves an additive error of O_d(log^{d-1} M) for declustering schemes.
Identifies an error in a recent proof of a lower bound for the declustering problem.
Closes the gap in that proof, so that the lower bound is established.
Abstract
The declustering problem is to allocate given data on parallel working storage devices in such a manner that typical requests find their data evenly distributed on the devices. Using deep results from discrepancy theory, we improve previous work of several authors concerning range queries to higher-dimensional data. We give a declustering scheme with an additive error of O_d(log^{d-1} M) independent of the data size, where d is the dimension, M the number of storage devices and d-1 does not exceed the smallest prime power in the canonical decomposition of M into prime powers. In particular, our schemes work for arbitrary M in dimensions two and three. For general d, they work for all M >= d-1 that are powers of two. Concerning lower bounds, we show that a recent proof of an Ω_d(log^{(d-1)/2} M) bound contains an error. We close the gap in the proof and…
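To make the notion of additive error concrete, here is a minimal Python sketch. It assumes the usual definition: for a box-shaped range query Q on a grid of data tiles, the error is the largest number of tiles of Q stored on one device minus the ideal load ceil(|Q| / M). The scheme used, `disk_modulo`, is a simple illustrative baseline and not the scheme from the paper; all function names are hypothetical.

```python
from itertools import product
from math import ceil


def disk_modulo(cell, M):
    """Illustrative baseline (not the paper's scheme): device = coordinate sum mod M."""
    return sum(cell) % M


def additive_error(query, M, scheme=disk_modulo):
    """Additive error of one box query.

    query is a list of half-open (lo, hi) ranges, one per dimension.
    Returns (max tiles on any one device) - ceil(|Q| / M).
    """
    loads = [0] * M
    for cell in product(*(range(lo, hi) for lo, hi in query)):
        loads[scheme(cell, M)] += 1
    size = sum(loads)  # number of tiles in the query
    return max(loads) - ceil(size / M)


def worst_error_2d(n, M, scheme=disk_modulo):
    """Brute-force the worst additive error over all boxes in an n x n grid."""
    worst = 0
    for x0, y0 in product(range(n), repeat=2):
        for x1, y1 in product(range(x0 + 1, n + 1), range(y0 + 1, n + 1)):
            err = additive_error([(x0, x1), (y0, y1)], M, scheme)
            worst = max(worst, err)
    return worst


if __name__ == "__main__":
    M = 4
    print(additive_error([(0, 4), (0, 4)], M))  # full 4 x 4 block
    print(additive_error([(0, 2), (0, 2)], M))  # small 2 x 2 block
    print(worst_error_2d(6, M))                 # worst case on a 6 x 6 grid
```

With M = 4, the 4 x 4 block is perfectly balanced, while the 2 x 2 block places two of its four tiles on the same device. A good declustering scheme keeps this worst-case gap small for every box, whatever the grid size.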
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed systems and fault tolerance · Cryptography and Data Security
