Don't Thrash: How to Cache Your Hash on Flash
Michael A. Bender, Martin Farach-Colton, Rob Johnson, Russell Kraner, Bradley C. Kuszmaul, Dzejla Medjedovic, Pablo Montes, Pradeep Shetty, Richard P. Spillane, Erez Zadok

TL;DR
This paper introduces the quotient filter, which is roughly comparable to the Bloom filter in space and time but has better data locality, and two SSD-optimized data structures, the buffered quotient filter and the cascade filter, which outperform Bloom filters in I/O performance for large-scale set membership queries.
Contribution
It presents the quotient filter as a superior alternative to Bloom filters, supporting deletions and dynamic resizing, and introduces SSD-optimized variants that significantly improve performance.
Findings
Cascade filter and buffered quotient filter perform insertions 8.6-11x faster than Bloom filters.
Lookup speed ranges from 0.94x to 2.56x that of Bloom filters, so lookups are at worst slightly slower and at best more than twice as fast.
Quotient filter supports deletions and dynamic resizing, unlike Bloom filters.
Abstract
This paper presents new alternatives to the well-known Bloom filter data structure. The Bloom filter, a compact data structure supporting set insertion and membership queries, has found wide application in databases, storage systems, and networks. Because the Bloom filter performs frequent random reads and writes, it is used almost exclusively in RAM, limiting the size of the sets it can represent. This paper first describes the quotient filter, which supports the basic operations of the Bloom filter, achieving roughly comparable performance in terms of space and time, but with better data locality. Operations on the quotient filter require only a small number of contiguous accesses. The quotient filter has other advantages over the Bloom filter: it supports deletions, it can be dynamically resized, and two quotient filters can be efficiently merged. The paper then gives two data…
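To make the operations listed in the abstract concrete, here is a minimal Python sketch of quotienting, the idea the quotient filter takes its name from: a key is hashed to a fixed-width fingerprint, the high bits (the quotient) select a bucket, and only the low bits (the remainder) are stored. This is a toy under stated assumptions, not the paper's layout. The real structure keeps remainders in one flat array with a few metadata bits per slot, which is where the contiguous accesses come from, while this sketch uses a dict of sorted lists. The class name `ToyQuotientFilter`, its method names, the bit widths, and the use of SHA-256 are all hypothetical choices made for illustration.

```python
import hashlib
from bisect import bisect_left, insort


class ToyQuotientFilter:
    """Toy illustration of quotienting; NOT the paper's compact layout."""

    def __init__(self, q_bits=8, r_bits=8):
        self.q_bits = q_bits  # bits used to pick a bucket (the quotient)
        self.r_bits = r_bits  # bits actually stored (the remainder)
        # Assumption: a dict of sorted lists stands in for the flat slot array.
        self.buckets = {}

    def _fingerprint(self, key):
        # Assumption: SHA-256 truncated to q_bits + r_bits bits.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        return value & ((1 << (self.q_bits + self.r_bits)) - 1)

    def _split(self, fp):
        return fp >> self.r_bits, fp & ((1 << self.r_bits) - 1)

    def _insert_fp(self, fp):
        quotient, remainder = self._split(fp)
        insort(self.buckets.setdefault(quotient, []), remainder)

    def _find(self, key):
        quotient, remainder = self._split(self._fingerprint(key))
        bucket = self.buckets.get(quotient, [])
        i = bisect_left(bucket, remainder)
        found = i < len(bucket) and bucket[i] == remainder
        return quotient, bucket, i, found

    def insert(self, key):
        self._insert_fp(self._fingerprint(key))

    def may_contain(self, key):
        # True can be a false positive; False is always correct.
        return self._find(key)[3]

    def delete(self, key):
        # Only safe for keys that were actually inserted.
        quotient, bucket, i, found = self._find(key)
        if found:
            bucket.pop(i)
            if not bucket:
                del self.buckets[quotient]
        return found

    def fingerprints(self):
        # Full fingerprints are recoverable, in sorted order, from the table.
        for quotient in sorted(self.buckets):
            for remainder in self.buckets[quotient]:
                yield (quotient << self.r_bits) | remainder

    def merge(self, other):
        # Merging needs only the stored fingerprints, never the original keys.
        if (self.q_bits, self.r_bits) != (other.q_bits, other.r_bits):
            raise ValueError("filters must use the same bit widths")
        merged = ToyQuotientFilter(self.q_bits, self.r_bits)
        for source in (self, other):
            for fp in source.fingerprints():
                merged._insert_fp(fp)
        return merged

    def resized(self):
        # Doubling the bucket count moves one bit from remainder to quotient.
        if self.r_bits < 2:
            raise ValueError("no remainder bits left to give up")
        bigger = ToyQuotientFilter(self.q_bits + 1, self.r_bits - 1)
        for fp in self.fingerprints():
            bigger._insert_fp(fp)
        return bigger
```

Because each stored entry can be turned back into its full fingerprint, deletion, merging, and resizing all work without access to the original keys; a Bloom filter's bit array does not retain enough information to do the same.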
Taxonomy
Topics: Caching and Content Delivery · Advanced Data Storage Technologies · Peer-to-Peer Network Technologies
