From WiscKey to Bourbon: A Learned Index for Log-Structured Merge Trees
Yifan Dai, Yien Xu, Aishwarya Ganesan, Ramnatthan Alagappan, Brian Kroth, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau

TL;DR
Bourbon is a log-structured merge (LSM) tree that uses machine learning to speed up lookups: it learns key distributions and improves lookup performance by 1.23x-1.78x over state-of-the-art production LSMs.
Contribution
The paper presents BOURBON, an LSM tree design that uses greedy piecewise linear regression to learn key distributions and a cost-benefit strategy to decide when learning is worthwhile.
Findings
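Bourbon improves lookup performance by 1.23x to 1.78x compared to state-of-the-art production LSMs.
It learns key distributions with greedy piecewise linear regression, which enables fast lookups with minimal computation.
Experiments on both synthetic and real-world datasets support these gains.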
Abstract
We introduce BOURBON, a log-structured merge (LSM) tree that utilizes machine learning to provide fast lookups. We base the design and implementation of BOURBON on empirically-grounded principles that we derive through careful analysis of LSM design. BOURBON employs greedy piecewise linear regression to learn key distributions, enabling fast lookup with minimal computation, and applies a cost-benefit strategy to decide when learning will be worthwhile. Through a series of experiments on both synthetic and real-world datasets, we show that BOURBON improves lookup performance by 1.23x-1.78x as compared to state-of-the-art production LSMs.
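To make the idea of learning a key distribution with piecewise linear regression concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: it uses a simplified "shrinking cone" greedy fit rather than the exact Greedy-PLR algorithm, it assumes keys are sorted, unique, and numeric, and an in-memory list stands in for a sorted on-disk file. The names `fit_segments`, `LearnedIndex`, and `delta` (the maximum position error) are hypothetical.

```python
import math
from bisect import bisect_right


def _pick_slope(lo, hi):
    # A one-point segment has no upper bound yet; any slope works, so use 0.
    return lo if math.isinf(hi) else (lo + hi) / 2


def fit_segments(keys, delta):
    """Greedily cover sorted, unique numeric keys with line segments.

    Each segment is (start_key, slope, start_pos) and predicts
    start_pos + slope * (key - start_key), which stays within
    +/- delta of the true position for every key in the segment.
    """
    if not keys:
        return []
    segments = []
    k0, p0 = keys[0], 0       # anchor point of the current segment
    lo, hi = 0.0, math.inf    # slopes still feasible for this segment
    for pos in range(1, len(keys)):
        dx = keys[pos] - k0   # > 0 because keys are sorted and unique
        new_lo = max(lo, (pos - delta - p0) / dx)
        new_hi = min(hi, (pos + delta - p0) / dx)
        if new_lo > new_hi:
            # No single slope fits this point as well: close the segment
            # and start a new one anchored at the current key.
            segments.append((k0, _pick_slope(lo, hi), p0))
            k0, p0 = keys[pos], pos
            lo, hi = 0.0, math.inf
        else:
            lo, hi = new_lo, new_hi
    segments.append((k0, _pick_slope(lo, hi), p0))
    return segments


class LearnedIndex:
    """Hypothetical wrapper: model prediction plus a bounded local search."""

    def __init__(self, keys, delta=8):
        self.keys = keys
        self.delta = delta
        self.segments = fit_segments(keys, delta)
        self.starts = [seg[0] for seg in self.segments]

    def lookup(self, key):
        # Find the segment whose start key is the largest one <= key.
        i = bisect_right(self.starts, key) - 1
        if i < 0:
            return None
        k0, slope, p0 = self.segments[i]
        pred = p0 + slope * (key - k0)
        # Only positions within delta of the prediction need to be checked.
        first = max(0, math.floor(pred - self.delta))
        last = min(len(self.keys) - 1, math.ceil(pred + self.delta))
        for pos in range(first, last + 1):
            if self.keys[pos] == key:
                return pos
        return None


index = LearnedIndex([1, 4, 9, 16, 25, 36, 49, 64, 81, 100], delta=1)
print(index.lookup(49), index.lookup(50))
```

The point of the sketch is the trade-off it exposes: a lookup costs one search over the segment start keys plus a scan of at most roughly 2 * delta + 1 positions, so a larger `delta` yields fewer segments but a wider final search. Deciding whether building such a model pays off at all is the job of the cost-benefit strategy the abstract mentions, which this sketch does not model.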
Taxonomy
Topics: Data Quality and Management · Time Series Analysis and Forecasting · Data Mining Algorithms and Applications
Methods: Linear Regression
