Cascaded Learned Bloom Filter for Optimal Model-Filter Size Balance and Fast Rejection

Atsuki Sato; Yusuke Matsui

arXiv:2502.03696·cs.DS·February 7, 2025

TL;DR

This paper introduces the Cascaded Learned Bloom Filter (CLBF), a novel approach that optimizes the balance between model and filter sizes and significantly reduces reject time, enhancing memory efficiency and speed.

Contribution

The paper presents a dynamic programming-based method to automatically optimize the configuration of learned Bloom filters, addressing key limitations in existing approaches.

Findings

1. Reduces memory usage by up to 24%.
2. Decreases reject time by up to 14 times.
3. Outperforms state-of-the-art learned Bloom filters.

Abstract

Recent studies have demonstrated that learned Bloom filters, which combine machine learning with the classical Bloom filter, can achieve superior memory efficiency. However, existing learned Bloom filters face two critical unresolved challenges: the balance between the machine learning model size and the Bloom filter size is not optimal, and the reject time cannot be minimized effectively. We propose the Cascaded Learned Bloom Filter (CLBF) to address these issues. Our dynamic programming-based optimization automatically selects configurations that achieve an optimal balance between the model and filter sizes while minimizing reject time. Experiments on real-world datasets show that CLBF reduces memory usage by up to 24% and decreases reject time by up to 14 times compared to state-of-the-art learned Bloom filters.
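For readers new to the idea, the following is a minimal sketch of a basic (non-cascaded) learned Bloom filter, the structure that CLBF builds on. It is not the authors' implementation and does not include the cascade or the dynamic programming optimization. The names `BloomFilter`, `LearnedBloomFilter`, and `toy_score` are hypothetical, and `toy_score` is a hand-written stand-in for a trained classifier. The sketch shows the trade-off the abstract refers to: keys the model already accepts never enter the backup filter, so a stronger (usually larger) model allows a smaller filter.

```python
import hashlib


class BloomFilter:
    """Classical Bloom filter over strings, using salted SHA-256 hashes."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable; a real filter packs bits.
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class LearnedBloomFilter:
    """A score function plus a backup filter for keys the model misses."""

    def __init__(self, keys, score, threshold, num_bits=1024, num_hashes=3):
        self.score = score
        self.threshold = threshold
        self.backup = BloomFilter(num_bits, num_hashes)
        for key in keys:
            # Only the model's false negatives go into the backup filter,
            # which preserves the no-false-negative guarantee.
            if score(key) < threshold:
                self.backup.add(key)

    def __contains__(self, key):
        if self.score(key) >= self.threshold:
            return True  # model accepts; this may be a false positive
        return key in self.backup  # otherwise fall back to the filter


def toy_score(key):
    # Hypothetical stand-in for a trained model (assumption, not from the paper).
    return 0.9 if key.startswith("mal-") else 0.1


keys = ["mal-a.example", "mal-b.example", "odd-one.example"]
lbf = LearnedBloomFilter(keys, toy_score, threshold=0.5)
```

In this sketch every stored key is reported as present, either by the score function or by the backup filter. Note that a negative query is rejected only after the score function has run and the backup filter has been checked, which is why model inference cost matters for the reject time discussed in the abstract.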

Taxonomy

Topics: Caching and Content Delivery · Machine Learning and ELM

Methods: BLOOM