Succinct Data Structures for Retrieval and Approximate Membership
Martin Dietzfelbinger, Rasmus Pagh

TL;DR
This paper introduces space-efficient data structures for retrieval and approximate membership. For retrieval, it eliminates the Theta(n)-bit space overhead incurred by minimal perfect hashing and comes within a factor 1+e^{-k} of optimal space at query time O(k). Construction takes expected O(n) time and evaluation is simple.
Contribution
It presents succinct data structures for retrieval and approximate membership that remove the space overhead of earlier constructions and approach the theoretical lower bounds while keeping evaluation simple.
Findings
For any k, achieves retrieval with query time O(k) using space within a factor 1+e^{-k} of optimal, asymptotically for large n.
Eliminates space overhead in approximate membership structures, approaching lower bounds.
Provides expected O(n) construction time and simple evaluation procedures.
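To get a feel for the 1+e^{-k} bound, the following sketch tabulates the factor for small k. It assumes the optimum is the nr bits needed for the function values (as stated in the abstract) and ignores lower-order terms; the function names are illustrative and do not come from the paper.

```python
import math


def overhead_factor(k):
    """Space factor 1 + e^{-k} relative to optimal, for query time O(k)."""
    return 1.0 + math.exp(-k)


def space_bits(n, r, k):
    """Estimated space in bits, assuming optimal = n * r (lower-order terms ignored)."""
    return overhead_factor(k) * n * r


if __name__ == "__main__":
    n, r = 1_000_000, 8  # hypothetical instance: one million keys, 8-bit values
    for k in range(1, 7):
        extra = math.exp(-k)
        print(f"k={k}: factor {overhead_factor(k):.4f}, "
              f"about {100 * extra:.2f}% above n*r, "
              f"{space_bits(n, r, k) / 8 / 1e6:.3f} MB")
```

The overhead decays exponentially in k: roughly 37% at k=1, 5% at k=3, and under 1% at k=5.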
Abstract
The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function f: U -> {0,1}^r that has specified values on the elements of a given set S, a subset of U with |S| = n, but may have any value on elements outside S. Minimal perfect hashing makes it possible to avoid storing the set S, but this induces a space overhead of Theta(n) bits in addition to the nr bits needed for function values. In this paper we show how to eliminate this overhead. Moreover, we show that for any k, query time O(k) can be achieved using space that is within a factor 1+e^{-k} of optimal, asymptotically for large n. If we allow logarithmic evaluation time, the additive overhead can be reduced to O(log log n) bits with high probability (whp). The time to construct the data structure is O(n), expected. A main technical ingredient is to utilize existing tight bounds on the probability…
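The retrieval interface described above can be illustrated with a small sketch. It uses a generic XOR-based scheme: each key hashes to k table cells, and the table is filled by Gaussian elimination over GF(2) so that the XOR of a key's cells equals its value. This is an illustration under stated assumptions, not the paper's construction: the table size (`slack` = 1.25), the BLAKE2b-based hashing, and all function names are choices made here, and the quadratic-time solver makes no attempt to match the expected O(n) construction time or the space bounds in the abstract. The second half shows the standard way to turn a retrieval structure into an approximate membership structure by storing r-bit fingerprints, which gives a false-positive probability of roughly 2^{-r}.

```python
import hashlib


def _hash(tag, key, mod):
    """Deterministic hash of (tag, key) into range(mod)."""
    digest = hashlib.blake2b(f"{tag}|{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % mod


def row_mask(key, seed, m, k):
    """Bitmask of the table cells XORed together for `key` (repeated cells cancel)."""
    mask = 0
    for i in range(k):
        mask ^= 1 << _hash(f"pos:{seed}:{i}", key, m)
    return mask


def build(items, r, k=3, slack=1.25, max_tries=50):
    """Return (seed, table) such that retrieve(key, seed, table) == items[key].

    The table has m = int(slack * n) + 1 cells of r bits each; `slack` is an
    assumption of this sketch, not a parameter taken from the paper.
    """
    n = len(items)
    m = int(slack * n) + 1
    for seed in range(max_tries):
        pivots = {}  # highest set bit -> (reduced row mask, reduced value)
        consistent = True
        for key, val in items.items():
            assert 0 <= val < (1 << r)
            row = row_mask(key, seed, m, k)
            while row:  # eliminate against pivot rows found so far
                piv = row.bit_length() - 1
                if piv not in pivots:
                    break
                prow, pval = pivots[piv]
                row ^= prow
                val ^= pval
            if row:
                pivots[piv] = (row, val)
            elif val:  # 0 = nonzero: this seed gives an unsolvable system
                consistent = False
                break
        if not consistent:
            continue
        table = [0] * m  # free (non-pivot) cells stay 0
        for piv in sorted(pivots):  # back-substitute from the lowest pivot up
            row, acc = pivots[piv]
            rest = row ^ (1 << piv)
            while rest:
                bit = rest.bit_length() - 1
                acc ^= table[bit]
                rest ^= 1 << bit
            table[piv] = acc
        return seed, table
    raise RuntimeError("no solvable hash seed found; increase slack")


def retrieve(key, seed, table, k=3):
    """XOR of the cells selected for `key`; arbitrary for keys outside S."""
    mask = row_mask(key, seed, len(table), k)
    acc = 0
    while mask:
        bit = mask.bit_length() - 1
        acc ^= table[bit]
        mask ^= 1 << bit
    return acc


def fingerprint(key, r):
    """r-bit fingerprint of `key`, hashed independently of the cell positions."""
    return _hash("fp", key, 1 << r)


def build_filter(keys, r):
    """Approximate membership: store each key's fingerprint as its value."""
    return build({key: fingerprint(key, r) for key in keys}, r)


def maybe_contains(key, seed, table, r):
    """True for every stored key; true for other keys with probability about 2^-r."""
    return retrieve(key, seed, table) == fingerprint(key, r)
```

Note that neither structure stores the keys themselves: `retrieve` returns some r-bit value for any input, which is exactly the freedom the problem definition allows on elements outside S.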
Taxonomy
Topics: Caching and Content Delivery · Advanced Image and Video Retrieval Techniques · Optimization and Search Problems
