Multiple Set Matching and Pre-Filtering with Bloom Multifilters
Francesco Concas, Pengfei Xu, Mohammad A. Hoque, Jiaheng Lu, and Sasu Tarkoma

TL;DR
This paper introduces two novel Bloom Multifilters, Bloom Matrix and Bloom Vector, for efficient multiple set matching. Their space and query performance can be optimized for the data distribution, and a new testing method helps select the better structure.
Contribution
Proposes Bloom Matrix and Bloom Vector, new probabilistic data structures for multiple set matching, with adaptive space optimization and a Bloom Test for structure selection.
Findings
Bloom Vector exploits Zipf distribution for space savings.
Bloom Matrix offers faster ADD and LOOKUP operations.
Bloom Matrix's false positive rate is less than 10^{-2} only under uniform distribution.
Abstract
A Bloom filter is a space-efficient probabilistic data structure for checking an element's membership in a set. Given multiple sets, however, a standard Bloom filter is not sufficient when looking for the items to which an element or a set of input elements belongs. In this article, we solve the multiple set matching problem by proposing two efficient Bloom Multifilters called Bloom Matrix and Bloom Vector. Both are space efficient and answer queries with a set of identifiers for multiple set matching problems. We show that the space efficiency can be optimized further according to the distribution of labels among multiple sets: Uniform and Zipf. While both are space efficient, Bloom Vector can efficiently exploit a Zipf distribution of the data for further space reduction. Our results also highlight that basic ADD and LOOKUP operations on Bloom Matrix are faster than on Bloom…
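As a rough illustration of the multiple set matching query described in the abstract, the Python sketch below keeps one bitmask per bit position, with one bit per set, so that a lookup returns a set of identifiers rather than a yes/no answer. This layout is an assumption made for illustration, not the authors' Bloom Matrix or Bloom Vector implementation; the class name `BloomMatrixSketch`, its parameters, and the salted SHA-256 hashing are all hypothetical.

```python
import hashlib


class BloomMatrixSketch:
    """Toy multi-set Bloom structure (illustrative assumption, not the paper's code)."""

    def __init__(self, num_sets, num_bits=1024, num_hashes=3):
        self.num_sets = num_sets
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # rows[i] is an integer bitmask: bit s is 1 if set s has bit position i set.
        self.rows = [0] * num_bits

    def _positions(self, element):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{element}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, element, set_id):
        """ADD: record that `element` belongs to the set numbered `set_id`."""
        if not 0 <= set_id < self.num_sets:
            raise ValueError("set_id out of range")
        for pos in self._positions(element):
            self.rows[pos] |= 1 << set_id

    def lookup(self, element):
        """LOOKUP: return identifiers of all sets that may contain `element`."""
        mask = (1 << self.num_sets) - 1
        # A set remains a candidate only if all of the element's positions are set for it.
        for pos in self._positions(element):
            mask &= self.rows[pos]
        return {s for s in range(self.num_sets) if (mask >> s) & 1}


bm = BloomMatrixSketch(num_sets=4)
bm.add("apple", 0)
bm.add("apple", 2)
candidates = bm.lookup("apple")  # always includes 0 and 2; may include false positives
```

As with a standard Bloom filter, this sketch never produces false negatives, but it can report extra set identifiers as false positives.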
Taxonomy
Topics: Caching and Content Delivery · Internet Traffic Analysis and Secure E-voting · Peer-to-Peer Network Technologies
