Blocked Bloom Filters with Choices

Johanna Elena Schmitz; Jens Zentgraf; Sven Rahmann

arXiv:2501.18977 · cs.DB · August 14, 2025


TL;DR

This paper introduces Blocked Bloom filters with choices, a probabilistic data structure that either uses less space at the same false positive rate or achieves a lower false positive rate at the same space, compared to standard Blocked Bloom filters. The filter is shown to be effective in bioinformatics workflows.

Contribution

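The paper proposes a variant of Blocked Bloom filters in which each key has two (or more) alternative blocks where its information may be stored. Having this choice reduces the space needed for a given false positive rate, or lowers the false positive rate for a given amount of space.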

Findings

01 Uses less space at the same false positive rate

02 Achieves a lower false positive rate at the same space

03 Effective in bioinformatics workflows

Abstract

Probabilistic filters are approximate set membership data structures that represent a set of keys in small space, and answer set membership queries without false negative answers, but with a certain allowed false positive probability. Such filters are widely used in database systems, networks, storage systems and in biological sequence analysis because of their fast query times and low space requirements. Starting with Bloom filters in the 1970s, many filter data structures have been developed, each with its own advantages and disadvantages, e.g., Blocked Bloom filters, Cuckoo filters, XOR filters, Ribbon filters, and more. We introduce Blocked Bloom filters with choices that work similarly to Blocked Bloom filters, except that for each key there are two (or more) alternative choices of blocks where the key's information may be stored. The result is a filter that partially inherits…
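The idea of giving each key several candidate blocks can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class name, the hash construction, the default parameters, and above all the insertion policy (store the key in the candidate block where the fewest new bits have to be set) are assumptions made for illustration, since the summary above does not specify them. The sketch also assumes that a key uses the same bit pattern in every candidate block.

```python
import hashlib


class BlockedBloomWithChoices:
    """Toy blocked Bloom filter where each key has `choices` candidate blocks.

    Hypothetical illustration only: names, hashing, and the insertion
    policy are assumptions, not taken from the paper.
    """

    def __init__(self, num_blocks, block_bits=512, k=4, choices=2):
        self.num_blocks = num_blocks
        self.block_bits = block_bits  # bits per block (e.g. one cache line)
        self.k = k                    # bits set per key inside its block
        self.choices = choices        # number of alternative blocks per key
        # Each block is a Python int used as a bit vector.
        self.blocks = [0] * num_blocks

    def _hash(self, key, salt):
        # Salted 64-bit hash; any good hash family would do here.
        data = f"{salt}:{key}".encode("utf-8")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "little")

    def _candidate_blocks(self, key):
        # One candidate block index per choice.
        return [self._hash(key, f"block{c}") % self.num_blocks
                for c in range(self.choices)]

    def _mask(self, key):
        # Bit pattern of the key inside a block (assumed identical per choice).
        mask = 0
        for i in range(self.k):
            mask |= 1 << (self._hash(key, f"bit{i}") % self.block_bits)
        return mask

    def insert(self, key):
        mask = self._mask(key)
        # Assumed policy: choose the block that needs the fewest new bits.
        best = min(self._candidate_blocks(key),
                   key=lambda b: bin(mask & ~self.blocks[b]).count("1"))
        self.blocks[best] |= mask

    def contains(self, key):
        mask = self._mask(key)
        # The key may sit in any candidate block, so all must be checked.
        return any((self.blocks[b] & mask) == mask
                   for b in self._candidate_blocks(key))
```

In this sketch a query has to inspect every candidate block, because the filter does not record which block was chosen at insertion time. Inserted keys are always reported as present (no false negatives), while keys that were never inserted may occasionally be reported as present (false positives).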
