# Sampling and Reconstruction Using Bloom Filters

**Authors:** Neha Sengupta, Amitabha Bagchi, Srikanta Bedathur, Maya Ramanath

arXiv: 1701.03308 · 2019-05-15

## TL;DR

This paper addresses two problems on a set stored as a Bloom filter: drawing an almost uniform sample from the set and reconstructing the set. It introduces the BloomSampleTree data structure and the HashInvert method, and supports both with run-time bounds and an experimental evaluation.

## Contribution

It presents the BloomSampleTree data structure and the HashInvert method, which the authors describe as, to the best of their knowledge, the first solutions for sampling from and reconstructing sets stored as Bloom filters.

## Key findings

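- BloomSampleTree, a hierarchical data structure, supports extracting an almost uniform sample from the set stored in a Bloom filter and reconstructing that set efficiently.
- HashInvert is a more space-efficient reconstruction method that applies when the hash functions are partially invertible, meaning it is easy to compute the set of elements that map to a given hash value.
- The authors give run-time bounds for both methods and sample-quality bounds for the BloomSampleTree-based algorithm, and their experimental evaluation reports that both methods are efficient and effective.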
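
The invertible-hash case in the list above can be illustrated with a small sketch. It is not the paper's HashInvert algorithm, only a minimal example of the underlying idea, under assumptions chosen here for illustration: elements are integers in a namespace `[0, M)`, the filter has a prime number of bits `m`, and each hash has the hypothetical form `h(x) = (a*x + b) % m`, whose preimages are easy to enumerate with a modular inverse. The names `HASHES`, `preimages`, and `reconstruct` are made up for this example.

```python
from typing import Iterable, List, Set

M = 1000                               # assumed namespace: integers in [0, M)
m = 101                                # assumed filter length; prime, so every a below is invertible mod m
HASHES = [(3, 7), (5, 11), (7, 13)]    # hypothetical (a, b) pairs for h(x) = (a*x + b) % m


def positions(x: int) -> List[int]:
    """Bit positions that element x maps to."""
    return [(a * x + b) % m for a, b in HASHES]


def build_filter(items: Iterable[int]) -> List[bool]:
    """Standard Bloom filter insertion: set every hashed position."""
    bits = [False] * m
    for x in items:
        for p in positions(x):
            bits[p] = True
    return bits


def contains(bits: List[bool], x: int) -> bool:
    """Standard Bloom filter membership test (may report false positives)."""
    return all(bits[p] for p in positions(x))


def preimages(a: int, b: int, v: int) -> range:
    """All x in [0, M) with (a*x + b) % m == v, via the modular inverse of a."""
    x0 = ((v - b) * pow(a, -1, m)) % m   # pow(a, -1, m) needs Python 3.8+
    return range(x0, M, m)


def reconstruct(bits: List[bool]) -> Set[int]:
    """Invert one hash on every set bit, then keep candidates that pass the full test."""
    a, b = HASHES[0]
    found: Set[int] = set()
    for v, bit in enumerate(bits):
        if bit:
            for x in preimages(a, b, v):
                if contains(bits, x):
                    found.add(x)
    return found
```

In this sketch, `reconstruct(build_filter({3, 250, 999}))` returns a superset of the stored set: every stored element is recovered, together with any false positives of the filter. Only candidates that map to a set bit are tested, rather than the whole namespace.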

## Abstract

In this paper, we address the problem of sampling from a set and reconstructing a set stored as a Bloom filter. To the best of our knowledge, our work is the first to address this question. We introduce a novel hierarchical data structure called BloomSampleTree that helps us design efficient algorithms to extract an almost uniform sample from the set stored in a Bloom filter and also allows us to reconstruct the set efficiently. In the case where the hash functions used in the Bloom filter implementation are partially invertible, in the sense that it is easy to calculate the set of elements that map to a particular hash value, we propose a second, more space-efficient method called HashInvert for the reconstruction. We study the properties of these two methods both analytically and experimentally. We provide bounds on run times for both methods and sample quality for the BloomSampleTree-based algorithm, and show through an extensive experimental evaluation that our methods are efficient and effective.
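
The hierarchical idea described in the abstract can be sketched roughly as follows. This is a simplified illustration, not the paper's BloomSampleTree construction or its sampling algorithm. It assumes a binary tree over an integer namespace, where each node stores a Bloom filter of every namespace element in its range, and it uses the number of shared set bits between the query filter and a child's filter as a crude stand-in for an intersection-size estimate. All names (`Node`, `build_tree`, `sample`, `HASHES`, `M_BITS`) are hypothetical, and the sketch makes no claim about how close to uniform its output is.

```python
import random
from dataclasses import dataclass
from typing import Iterable, Optional

M_BITS = 1024                          # assumed filter length in bits
HASHES = [(3, 7), (5, 11), (7, 13)]    # hypothetical (a, b) pairs for h(x) = (a*x + b) % M_BITS


def make_filter(items: Iterable[int]) -> int:
    """Bloom filter packed into a Python int; bit p is stored as 1 << p."""
    bits = 0
    for x in items:
        for a, b in HASHES:
            bits |= 1 << ((a * x + b) % M_BITS)
    return bits


def contains(bits: int, x: int) -> bool:
    """Membership test against a packed filter (may report false positives)."""
    return all((bits >> ((a * x + b) % M_BITS)) & 1 for a, b in HASHES)


@dataclass
class Node:
    lo: int                            # node covers the namespace range [lo, hi)
    hi: int
    bits: int                          # filter of all namespace elements in the range
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(lo: int, hi: int, leaf_size: int) -> Node:
    """Recursively halve the range until it holds at most leaf_size elements."""
    node = Node(lo, hi, make_filter(range(lo, hi)))
    if hi - lo > leaf_size:
        mid = (lo + hi) // 2
        node.left = build_tree(lo, mid, leaf_size)
        node.right = build_tree(mid, hi, leaf_size)
    return node


def sample(node: Node, query: int, rng: random.Random) -> Optional[int]:
    """Walk down the tree, preferring children whose filter overlaps the query more."""
    if node.left is None or node.right is None:
        # Leaf: brute-force the small range and pick uniformly among the hits.
        hits = [x for x in range(node.lo, node.hi) if contains(query, x)]
        return rng.choice(hits) if hits else None
    children = [node.left, node.right]
    weights = [bin(c.bits & query).count("1") for c in children]
    if sum(weights) == 0:
        return None                    # no shared bits, so no element here can match
    first = rng.choices([0, 1], weights=weights)[0]
    for i in (first, 1 - first):       # fall back to the sibling on a dead end
        if weights[i] > 0:
            got = sample(children[i], query, rng)
            if got is not None:
                return got
    return None
```

A possible usage: build the tree once with `tree = build_tree(0, 1024, 16)`, then call `sample(tree, make_filter({5, 300, 777}), random.Random(0))` for each query filter. Every returned value passes the membership test on the query filter, so it is either a stored element or one of the filter's false positives.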

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1701.03308/full.md

## Figures

35 figures with captions in the complete paper: https://tomesphere.com/paper/1701.03308/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1701.03308/full.md

---
Source: https://tomesphere.com/paper/1701.03308