Bloom filter variants for multiple sets: a comparative assessment
Luca Calderoni, Dario Maio, Paolo Palmieri

TL;DR
This paper compares two Bloom filter variants that store multiple sets in a single filter, the shifting Bloom filter (ShBF) and the spatial Bloom filter (SBF), in terms of false positive rate, inter-set error rate, and efficiency. ShBF is more space-efficient, while SBF is computationally cheaper.
Contribution
The paper extends the shifting Bloom filter's functionality for multiple subsets, introduces a generalized ShBF definition, and provides a comprehensive performance comparison with the spatial Bloom filter.
Findings
ShBF offers better space efficiency than SBF.
SBF has lower computational cost.
Extended ShBF performs well for multiple subsets.
Abstract
In this paper we compare two probabilistic data structures for association queries derived from the well-known Bloom filter: the shifting Bloom filter (ShBF), and the spatial Bloom filter (SBF). With respect to the original data structure, both variants add the ability to store multiple subsets in the same filter, using different strategies. We analyse the performance of the two data structures with respect to false positive probability and inter-set error probability (the probability that an element in the set is recognised as belonging to the wrong subset). As part of our analysis, we extended the functionality of the shifting Bloom filter, optimising the filter for any non-trivial number of subsets. We propose a new generalised ShBF definition with applications outside of our specific domain, and present new probability formulas. Results of the comparison show that the ShBF…
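To make the two error types in the abstract concrete, here is a minimal Python sketch of a multi-set filter written in the spirit of a spatial Bloom filter. It is an illustration under stated assumptions, not the paper's definition: the class name `MultiSetFilter`, the SHA-256 based hashing, and the "keep the highest label on insert, report the lowest label on query" rule are choices made for this example. The helper `classic_fpp` is the standard single-set Bloom filter approximation, not one of the new formulas the paper presents.

```python
import hashlib
import math


class MultiSetFilter:
    """Toy multi-set filter: each cell stores a subset label (0 = empty).

    Assumption for this sketch: labels are positive integers, insertion
    keeps the highest label seen in a cell, and a query reports the
    lowest label found among the k probed cells.
    """

    def __init__(self, m, k):
        self.m = m            # number of cells
        self.k = k            # number of hash functions
        self.cells = [0] * m

    def _positions(self, item):
        # Derive k cell indices by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item, label):
        if label < 1:
            raise ValueError("labels must be positive integers")
        for p in self._positions(item):
            self.cells[p] = max(self.cells[p], label)

    def query(self, item):
        # 0 means "not in any subset". A non-zero answer for an item that
        # was never inserted is a false positive; a label higher than the
        # one the item was inserted with is an inter-set error.
        return min(self.cells[p] for p in self._positions(item))


def classic_fpp(m, k, n):
    """Standard approximation for a single-set Bloom filter."""
    return (1.0 - math.exp(-k * n / m)) ** k


f = MultiSetFilter(m=1024, k=3)
f.add("alice", 1)
f.add("bob", 2)
print(f.query("alice"), f.query("bob"))
print(classic_fpp(1024, 3, 2))
```

In this sketch an inserted element is never reported as absent, and the label returned for it is never lower than its true label, so errors can only take the two forms named in the abstract.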
