Blocked Bloom Filters with Choices
Johanna Elena Schmitz, Jens Zentgraf, Sven Rahmann

TL;DR
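This paper introduces Blocked Bloom filters with choices, a probabilistic filter in which each key has two or more alternative blocks where its information may be stored. Compared with standard Blocked Bloom filters, it uses less space at the same false positive rate, or reaches a lower false positive rate at the same space, and it is applied in bioinformatics workflows.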
Contribution
The paper proposes a novel variant of Blocked Bloom filters that incorporates multiple choices for key placement, improving space efficiency and false positive rates.
Findings
Uses less space at the same false positive rate
Lower false positive rate at the same space
Effective in bioinformatics workflows
Abstract
Probabilistic filters are approximate set membership data structures that represent a set of keys in small space, and answer set membership queries without false negative answers, but with a certain allowed false positive probability. Such filters are widely used in database systems, networks, storage systems and in biological sequence analysis because of their fast query times and low space requirements. Starting with Bloom filters in the 1970s, many filter data structures have been developed, each with its own advantages and disadvantages, e.g., Blocked Bloom filters, Cuckoo filters, XOR filters, Ribbon filters, and more. We introduce Blocked Bloom filters with choices that work similarly to Blocked Bloom filters, except that for each key there are two (or more) alternative choices of blocks where the key's information may be stored. The result is a filter that partially inherits…
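The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `BlockedBloomWithChoices`, the parameters (`num_blocks`, `block_bits`, `k`, `choices`), the use of BLAKE2b for hashing, and above all the insertion policy (store the key in the candidate block where the fewest new bits must be set) are assumptions made for illustration. The truncated abstract does not say how the paper selects a block.

```python
import hashlib


class BlockedBloomWithChoices:
    """Toy blocked Bloom filter where each key has several candidate blocks.

    Illustrative sketch only; names, parameters and the insertion policy
    are assumptions, not taken from the paper.
    """

    def __init__(self, num_blocks=1024, block_bits=512, k=8, choices=2):
        self.num_blocks = num_blocks
        self.block_bits = block_bits
        self.k = k
        self.choices = choices
        # Each block is a Python int used as a bit vector of block_bits bits.
        self.blocks = [0] * num_blocks

    def _hash(self, key, salt):
        # Salted 64-bit hash; BLAKE2b is an arbitrary choice for this sketch.
        data = f"{salt}:{key}".encode("utf-8")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "little")

    def _candidates(self, key):
        # Yield (block index, bit mask) for every candidate block of the key.
        for c in range(self.choices):
            block = self._hash(key, f"block{c}") % self.num_blocks
            mask = 0
            for i in range(self.k):
                mask |= 1 << (self._hash(key, f"bit{c}-{i}") % self.block_bits)
            yield block, mask

    def add(self, key):
        # Assumed policy: pick the candidate block that needs the fewest
        # additional bits set, then store the key only in that block.
        best = None
        for block, mask in self._candidates(key):
            cost = bin(mask & ~self.blocks[block]).count("1")
            if best is None or cost < best[0]:
                best = (cost, block, mask)
        _, block, mask = best
        self.blocks[block] |= mask

    def __contains__(self, key):
        # A key may be present if all of its bits are set in ANY candidate
        # block. Inserted keys are always found (no false negatives).
        return any(
            (self.blocks[block] & mask) == mask
            for block, mask in self._candidates(key)
        )


if __name__ == "__main__":
    bf = BlockedBloomWithChoices()
    for kmer in ("ACGTACGT", "TTGACCAA", "GGGTTTAA"):
        bf.add(kmer)
    print("ACGTACGT" in bf)  # inserted keys are always reported as present
```

In this sketch a query inspects every candidate block, so it touches up to `choices` blocks instead of one; in exchange, the insertion has freedom to choose the block in which the key adds the fewest new bits.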
