# Role of Bloom Filter in Big Data Research: A Survey

**Authors:** Ripon Patgiri, Sabuzima Nayak, Samir Kumar Borgohain

arXiv: 1903.06565 · 2019-03-18

## TL;DR

This survey explores the role of Bloom Filters in managing and optimizing large-scale, unstructured, and duplicate data in Big Data systems across various fields, highlighting their efficiency and versatility.

## Contribution

It provides a comprehensive overview of Bloom Filter applications in Big Data, emphasizing their importance in data filtering, memory optimization, and interdisciplinary research.

## Key findings

- Bloom Filters efficiently filter duplicates in large datasets.
- They significantly reduce memory usage in Big Data storage.
- Bloom Filters are applied across diverse fields, such as bioinformatics and the government sector.
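
To make the memory claim concrete, the standard textbook sizing approximations for a Bloom Filter can be worked through. These formulas and the numbers below are not taken from this summary; they assume $n$ inserted items, $m$ bits, $k$ independent and uniform hash functions, and a target false-positive rate $p$.

$$
p \approx \left(1 - e^{-kn/m}\right)^{k}, \qquad
k_{\text{opt}} = \frac{m}{n}\ln 2, \qquad
m = -\frac{n \ln p}{(\ln 2)^{2}}
$$

For an assumed target of $p = 0.01$:

$$
\frac{m}{n} = -\frac{\ln 0.01}{(\ln 2)^{2}} \approx 9.59 \text{ bits per item}, \qquad
k_{\text{opt}} \approx 9.59 \times \ln 2 \approx 6.6 \;\Rightarrow\; k = 7
$$

Under these assumptions, a filter over $n = 10^{9}$ items needs roughly $9.59 \times 10^{9}$ bits, or about 1.2 GB, regardless of how large each item is.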

## Abstract

Big Data is one of the most popular emerging trends; it has become a blessing for humankind and a necessity of day-to-day life, with Facebook as one example. Every person is involved in producing data, either directly or indirectly. Thus, Big Data is a high volume of data with an exponential growth rate that consists of a variety of data. Big Data touches all fields, including the government sector, the IT industry, business, economy, engineering, bioinformatics, and other basic sciences. Thus, Big Data forms a data silo. Most of the data are duplicated and unstructured. To deal with such a data silo, the Bloom Filter is a precious resource for filtering out duplicate data. The Bloom Filter is also indispensable in a Big Data storage system for optimizing memory consumption. It uses a tiny amount of memory to filter a very large volume of data, and it stores information about a large set of data. Although the functionality of the Bloom Filter is limited to membership filtering, it can be adapted to various applications. Moreover, the Bloom Filter is deployed in diverse fields and is also used in interdisciplinary research areas such as bioinformatics. In this article, we expose the usefulness of the Bloom Filter in Big Data research.
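
The membership filtering and duplicate removal described in the abstract can be illustrated with a minimal sketch. This is not code from the paper: the class name `BloomFilter`, the helper `drop_probable_duplicates`, the SHA-256-based double hashing, and the sizes used are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: num_bits bits, num_hashes positions per item."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight slots to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive all positions from one SHA-256 digest
        # using double hashing, h1 + i * h2 (mod num_bits).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd, so never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def drop_probable_duplicates(records, bloom):
    """Yield records the filter has not seen; a false positive may drop a new record."""
    for record in records:
        if record not in bloom:
            bloom.add(record)
            yield record


if __name__ == "__main__":
    bloom = BloomFilter(num_bits=10_000, num_hashes=7)
    stream = ["alice", "bob", "alice", "carol", "bob"]
    print(list(drop_probable_duplicates(stream, bloom)))
    print(len(bloom.bits), "bytes used by the filter")
```

The filter stores only bits, never the items themselves, which is where the memory saving comes from. The trade-off is that a membership test can return a false positive, so in a deduplication pipeline a small fraction of genuinely new records may be discarded unless positives are confirmed against the underlying store.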

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.06565/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1903.06565/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1903.06565/full.md

---
Source: https://tomesphere.com/paper/1903.06565