Distance Sensitive Bloom Filters Without False Negatives
Mayank Goswami, Rasmus Pagh, Francesco Silvestri, Johan Sivertsen

TL;DR
This paper introduces distance sensitive Bloom filters that never return false negatives, and gives upper and lower bounds on their space usage (tight in several cases) for proximity queries on sets of vectors in Hamming space. This matters for applications that cannot tolerate missed matches.
Contribution
It presents the first false-negative-free distance sensitive Bloom filters, with space bounds that are tight in several cases, making proximity queries over sets of Hamming vectors reliable in the sense that a close element is never missed.
Findings
Established upper bounds on space usage for false-negative-free filters.
Derived lower bounds matching the upper bounds in several cases.
Demonstrated the feasibility of reliable proximity queries without false negatives.
Abstract
A Bloom filter is a widely used data structure for representing a set S and answering queries of the form "Is x in S?". By allowing some false positive answers (saying "yes" when the answer is in fact "no") Bloom filters use space significantly below what is required for storing S. In the distance sensitive setting we work with a set S of (Hamming) vectors and seek a data structure that offers a similar trade-off, but answers queries of the form "Is x close to an element of S?" (in Hamming distance). Previous work on distance sensitive Bloom filters has accepted false positive and false negative answers. Absence of false negatives is of critical importance in many applications of Bloom filters, so it is natural to ask if this can also be achieved in the distance sensitive setting. Our main contributions are upper and lower bounds (that are tight in several cases) for…
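To make the query model concrete, here is a minimal Python sketch of a naive baseline: a plain Bloom filter into which every vector within Hamming distance r of an element of S is inserted. This is not the construction analysed in the paper; it only illustrates what "no false negatives" means for queries of the form "Is x close to an element of S?". The names BloomFilter, hamming_ball, and build_naive_filter, and the parameters num_bits and num_hashes, are hypothetical choices made for this illustration.

```python
import hashlib
from itertools import combinations


class BloomFilter:
    """Plain Bloom filter over byte strings.

    False positives are possible; false negatives are not.
    """

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def hamming_ball(v: tuple, r: int):
    """Yield every 0/1 vector within Hamming distance r of v."""
    for k in range(r + 1):
        for flips in combinations(range(len(v)), k):
            w = list(v)
            for i in flips:
                w[i] ^= 1
            yield tuple(w)


def build_naive_filter(S, r, num_bits=4096, num_hashes=4):
    """Assumed baseline: store the whole radius-r ball around each element of S."""
    bf = BloomFilter(num_bits, num_hashes)
    for v in S:
        for w in hamming_ball(v, r):
            bf.add(bytes(w))
    return bf


S = [(0, 0, 0, 0, 0, 0, 0, 0), (1, 1, 1, 1, 0, 0, 0, 0)]
naive = build_naive_filter(S, r=1)

near = (0, 0, 0, 0, 0, 0, 0, 1)  # Hamming distance 1 from the first element
print(bytes(near) in naive)       # always True: every close vector was inserted
```

Because every vector in the radius-r ball is inserted, a query that is close to S is never rejected. The cost is that the number of inserted items grows with the size of the Hamming ball, which is the kind of space overhead that motivates asking how little space a false-negative-free structure can use.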
