TL;DR
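Proteus is a self-designing approximate range filter that configures itself from sampled data to minimize its false positive rate (FPR) for a given space requirement, guided by a formal FPR model. Integrated into RocksDB, it improves end-to-end performance by up to 5.3x over SuRF and Rosetta.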
Contribution
The paper introduces Proteus, a self-designing range filter that unifies the probabilistic and deterministic design spaces of state-of-the-art range filters, and grounds its configuration in a formal model of the FPR of prefix-based filters.
Findings
Proteus improves end-to-end performance in RocksDB by as much as 5.3x over SuRF and Rosetta.
The CPFPR model accurately predicts filter false positive rates.
Proteus is robust to workload shifts and incurs minimal modeling costs.
Abstract
We introduce Proteus, a novel self-designing approximate range filter, which configures itself based on sampled data in order to optimize its false positive rate (FPR) for a given space requirement. Proteus unifies the probabilistic and deterministic design spaces of state-of-the-art range filters to achieve robust performance across a larger variety of use cases. At the core of Proteus lies our Contextual Prefix FPR (CPFPR) model, a formal framework for the FPR of prefix-based filters across their design spaces. We empirically demonstrate the accuracy of our model and Proteus' ability to optimize over both synthetic workloads and real-world datasets. We further evaluate Proteus in RocksDB and show that it is able to improve end-to-end performance by as much as 5.3x over more brittle state-of-the-art methods such as SuRF and Rosetta. Our experiments also indicate that the cost of…
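To make the idea of a prefix-based range filter and its FPR concrete, here is a minimal Python sketch. It is not Proteus and does not implement the CPFPR model: it is a toy, fully deterministic filter that stores the leading bits of each key in a set, and it measures FPR empirically rather than predicting it. The names `PrefixSetFilter`, `empirical_fpr`, `range_is_empty`, and the 32-bit key width are assumptions made for illustration only.

```python
import bisect
import random

KEY_BITS = 32  # assumed key width for this toy example


class PrefixSetFilter:
    """Toy range filter: stores the top `prefix_bits` bits of every key exactly."""

    def __init__(self, keys, prefix_bits):
        self.shift = KEY_BITS - prefix_bits
        self.prefixes = {k >> self.shift for k in keys}

    def may_contain_range(self, lo, hi):
        # The range [lo, hi] can hold a key only if one of the prefixes
        # it overlaps is stored. No false negatives are possible.
        first, last = lo >> self.shift, hi >> self.shift
        return any(p in self.prefixes for p in range(first, last + 1))


def range_is_empty(sorted_keys, lo, hi):
    """Ground truth: True if no key falls inside [lo, hi]."""
    i = bisect.bisect_left(sorted_keys, lo)
    return i == len(sorted_keys) or sorted_keys[i] > hi


def empirical_fpr(keys, queries, prefix_bits):
    """Fraction of truly empty range queries that the filter reports as non-empty."""
    sorted_keys = sorted(keys)
    filt = PrefixSetFilter(keys, prefix_bits)
    empty = [(lo, hi) for lo, hi in queries if range_is_empty(sorted_keys, lo, hi)]
    if not empty:
        return 0.0
    false_positives = sum(filt.may_contain_range(lo, hi) for lo, hi in empty)
    return false_positives / len(empty)


if __name__ == "__main__":
    random.seed(0)
    keys = random.sample(range(1 << KEY_BITS), 1000)
    queries = []
    for _ in range(2000):
        lo = random.randrange((1 << KEY_BITS) - 64)
        queries.append((lo, lo + random.randrange(64)))
    for bits in (8, 16, 24, 32):
        print(bits, empirical_fpr(keys, queries, bits))
```

In this toy, longer prefixes can only lower the FPR, but they also increase the number of distinct prefixes that must be stored. That tension between prefix length and space is the kind of design choice a self-configuring filter has to resolve for a given workload and space requirement.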
