On the Choice of General Purpose Classifiers in Learned Bloom Filters: An Initial Analysis Within Basic Filters

Giacomo Fumagalli; Davide Raimondi; Raffaele Giancarlo; Dario Malchiodi; Marco Frasca

arXiv:2112.06563·cs.LG·December 14, 2021

TL;DR

This paper investigates the impact of different classifiers on Learned Bloom Filters, providing initial guidelines for selecting the most suitable classifier among five classic paradigms to optimize performance.

Contribution

The paper offers a first systematic analysis of classifier choice in Learned Bloom Filters, where the classifier's space footprint and classification time can affect overall filter performance, and proposes initial guidelines for selecting a classifier.

Findings

01 Analyzed five classic classification paradigms for Learned Bloom Filters

02 Provided initial guidelines for classifier selection

03 Highlighted the impact of classifier choice on filter performance

Abstract

Bloom Filters are a fundamental and pervasive data structure. Within the growing area of Learned Data Structures, several Learned versions of Bloom Filters have been considered, yielding advantages over classic Filters. Each of them uses a classifier, which is the Learned part of the data structure. Although it has a central role in those new filters, and its space footprint as well as classification time may affect the performance of the Learned Filter, no systematic study of which specific classifier to use in which circumstances is available. We report progress in this area here, providing also initial guidelines on which classifier to choose among five classic classification paradigms.
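
To make the role of the classifier concrete, the following is a minimal sketch of a basic Learned Bloom Filter, assuming the commonly described layout of a classifier followed by a backup Bloom filter that stores the keys the classifier misses. It is an illustration only, not the authors' implementation: the names `BloomFilter`, `LearnedBloomFilter`, `toy_score`, the threshold of 0.5, and the backup false positive rate are all hypothetical, and `toy_score` is a hand-written stand-in for a trained model.

```python
import hashlib
import math


class BloomFilter:
    """Classic Bloom filter with m bits and k hash functions (SHA-256 based)."""

    def __init__(self, n_items: int, fpr: float):
        n_items = max(1, n_items)
        # Standard sizing: m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.
        self.m = max(1, math.ceil(-n_items * math.log(fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, key: str):
        # Derive k bit positions by salting the key with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class LearnedBloomFilter:
    """Classifier in front, backup Bloom filter for keys the classifier rejects."""

    def __init__(self, keys, score, threshold: float, backup_fpr: float = 0.01):
        self.score = score          # hypothetical scoring function: key -> [0, 1]
        self.threshold = threshold
        # Keys scored below the threshold would be false negatives, so they
        # are stored in the backup filter to keep the no-false-negative guarantee.
        missed = [k for k in keys if score(k) < threshold]
        self.backup = BloomFilter(len(missed), backup_fpr)
        for k in missed:
            self.backup.add(k)

    def __contains__(self, key: str) -> bool:
        return self.score(key) >= self.threshold or key in self.backup


def toy_score(key: str) -> float:
    # Stand-in for a trained classifier (assumption for illustration only).
    return 0.9 if key.endswith(".example") else 0.1


keys = ["a.example", "b.example", "c.test", "d.test"]
lbf = LearnedBloomFilter(keys, toy_score, threshold=0.5)
```

In this sketch every key is reported as present, either by the classifier or by the backup filter, while a non-key such as `"zzz.example"` is a false positive produced by the classifier alone. A larger or slower `score` function adds directly to the filter's space and query time, which is the trade-off the abstract points to when it mentions the classifier's space footprint and classification time.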

Taxonomy

Topics: Caching and Content Delivery