DPBloomfilter: Securing Bloom Filters with Differential Privacy
Yekun Ke, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Jiahao Zhang

TL;DR
This paper introduces DPBloomfilter, a privacy-preserving Bloom filter that applies the Random Response mechanism from differential privacy to the filter. The goal is to protect sensitive data in large datasets while keeping the running complexity and utility of a standard Bloom filter.
Contribution
The paper presents what its authors describe as the first differential privacy guarantee for Bloom filters, combining a classical privacy mechanism with the data structure without increasing its running complexity.
Findings
Maintains high utility with privacy guarantees
Operates with same complexity as standard Bloom filter
First to provide differential privacy for Bloom filters
Abstract
The Bloom filter is a simple yet space-efficient probabilistic data structure that supports membership queries for dramatically large datasets. It is widely utilized and implemented across various industrial scenarios, often handling massive datasets that include sensitive user information necessitating privacy preservation. To address the challenge of maintaining privacy within the Bloom filter, we have developed the DPBloomfilter. This innovation integrates the classical differential privacy mechanism, specifically the Random Response technique, into the Bloom filter, offering robust privacy guarantees under the same running complexity as the standard Bloom filter. Through rigorous simulation experiments, we have demonstrated that our DPBloomfilter algorithm maintains high utility while ensuring privacy protections. To the best of our knowledge, this is the first work to provide…
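The idea described in the abstract can be illustrated with a minimal sketch, under stated assumptions. The abstract does not specify how the paper calibrates its noise, so the class name `RandomizedResponseBloomFilter`, the `privatize` method, the per-bit parameter `eps_per_bit`, and the salted SHA-256 hashing are all hypothetical choices made for illustration, not the authors' implementation. The sketch uses textbook randomized response: each bit is kept with probability e^eps / (1 + e^eps) and flipped otherwise.

```python
import hashlib
import math
import random


class RandomizedResponseBloomFilter:
    """Illustrative Bloom filter whose bit array is perturbed once by
    randomized response. Hypothetical sketch, not the paper's algorithm."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, item: str):
        # Derive k positions from salted SHA-256 digests (assumed hashing scheme).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Standard Bloom filter insertion: set all k positions to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def privatize(self, eps_per_bit: float, rng: random.Random) -> None:
        # Assumed per-bit budget: keep each bit with probability
        # e^eps / (1 + e^eps), flip it otherwise.
        keep_prob = math.exp(eps_per_bit) / (1.0 + math.exp(eps_per_bit))
        self.bits = [b if rng.random() < keep_prob else 1 - b for b in self.bits]

    def might_contain(self, item: str) -> bool:
        # Same O(k) query as a standard Bloom filter.
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    bf = RandomizedResponseBloomFilter(num_bits=1024, num_hashes=3)
    for user in ["alice", "bob", "carol"]:
        bf.add(user)
    bf.privatize(eps_per_bit=3.0, rng=random.Random(0))
    print(bf.might_contain("alice"))
```

In this sketch, queries cost the same as in a standard Bloom filter, and the perturbation is a single pass over the bit array. Because a set bit can be flipped to 0, an inserted item may be reported as absent, which a standard Bloom filter never does.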
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
- The paper is well-written and the authors have clearly reviewed the related work on Bloom filters and their wide range of use. The background notions on Bloom filters and differential privacy are also clearly summarized.
- The authors have provided a detailed theoretical analysis of the accuracy of the mechanism.
- The approach has not been validated experimentally or compared to other state-of-the-art approaches such as BLIP. Thus, the theoretical analysis has not been validated experimentally.
- There is clear tension between maintaining the accuracy of membership queries vs protecting the privacy of elements that have been inserted in the Bloom filter. In particular, if the accuracy of membership queries is high this means that an adversary can perform a reconstruction attack simply by enumerating pote
1. The paper introduces a new problem, privacy protection in sketches. This is a creative combination of existing ideas and has a strong connection to real-world use.
2. The paper presents a very detailed mathematical analysis of the theoretical properties of the algorithm.
1. The algorithm presented in the paper is too simple and has important defects. The original Bloom filter errs only on non-members (false positives), but this algorithm also errs on members (false negatives), which is a big problem.
2. There are no experimental results, which makes me doubt the claim that the algorithm "can still maintain good utility".
The problem is natural, and a Bloom filter is an intuitive direction. Improving the privacy analysis by directly analyzing the filter's output distribution instead of using basic composition could also be a nice step.
1) The paper doesn't provide any evidence for the quality of its solution. Possible evidence would be experimental comparison to existing methods (see, e.g., Patel et al 24, https://openreview.net/pdf?id=GQNvvQquO0 for one method, as well as others that it discusses) or a comparison of formal utility guarantees (which the aforementioned paper also has). Briefly, what does this method provide that the others don't?
2) The paper lacks a clear (or even, technically, correct -- see Questions below)
The paper is reasonably well-written, and it provides a mathematical analysis of the randomized response composition. It applies a theoretical framework that is consistent and technically sound.
The paper does the most basic thing one might expect for adding differential privacy to a Bloom filter, perturbing bits independently via randomized response. The analysis follows straightforwardly from standard DP composition and doesn’t introduce conceptual or algorithmic novelty beyond that. Consequently, while technically correct, the work is not particularly original or deep. Indeed, as the authors note, this idea was examined in the BLIP paper from 2012, and I see the difference between th
Taxonomy
Topics: Caching and Content Delivery · Cooperative Communication and Network Coding · Internet Traffic Analysis and Secure E-voting
