BLI: A High-performance Bucket-based Learned Index with Concurrency Support
Huibing Dong, Wenlong Wang, Chun Liu, and David Du

TL;DR
The paper introduces BLI, a bucket-based learned index that improves insertion throughput, supports lock-free concurrency, and balances lookup latency, insertion latency, and memory consumption, achieving up to 2.21x higher throughput than state-of-the-art learned indexes.
Contribution
BLI is a novel bucket-based learned index that adopts a 'globally sorted, locally unsorted' approach, enabling efficient updates and lock-free concurrency.
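To make the 'globally sorted, locally unsorted' idea concrete, here is a minimal single-threaded sketch. It is not the authors' implementation. The class name `BucketIndex`, the `capacity` parameter, the fence-key layout, and the split policy are all assumptions made for illustration, and a plain binary search stands in for the learned model that would predict the target bucket. Lock-free concurrency is not modeled.

```python
from bisect import bisect_right


class BucketIndex:
    """Toy 'globally sorted, locally unsorted' index (illustrative only)."""

    def __init__(self, capacity=4):
        self.capacity = capacity        # hypothetical maximum pairs per bucket
        self.fences = [float("-inf")]   # lower-bound key of each bucket, kept sorted
        self.buckets = [[]]             # buckets[i] holds unsorted (key, value) pairs

    def _locate(self, key):
        # Assumption: binary search over fence keys replaces the learned model.
        return bisect_right(self.fences, key) - 1

    def insert(self, key, value):
        i = self._locate(key)
        bucket = self.buckets[i]
        for j, (k, _) in enumerate(bucket):
            if k == key:                # overwrite an existing key in place
                bucket[j] = (key, value)
                return
        bucket.append((key, value))     # append only: no shifting of sorted data
        if len(bucket) > self.capacity:
            self._split(i)

    def _split(self, i):
        # Sorting happens only when a bucket overflows, not on every insert.
        pairs = sorted(self.buckets[i], key=lambda p: p[0])
        mid = len(pairs) // 2
        self.buckets[i] = pairs[:mid]
        self.buckets.insert(i + 1, pairs[mid:])
        self.fences.insert(i + 1, pairs[mid][0])

    def get(self, key):
        # Linear scan inside one small bucket; order within it is arbitrary.
        for k, v in self.buckets[self._locate(key)]:
            if k == key:
                return v
        return None
```

In this sketch an insert never moves existing pairs, which is the property the contribution statement credits for efficient updates. The cost is shifted to an occasional bucket split and to a short linear scan on lookup.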
Findings
BLI achieves up to 2.21x higher throughput than state-of-the-art learned indexes.
BLI attains up to 3.91x performance gains under multi-threaded workloads.
BLI effectively balances lookup latency, insertion latency, and memory consumption.
Abstract
Learned indexes are promising to replace traditional tree-based indexes. They typically employ machine learning models to efficiently predict target positions in strictly sorted linear arrays. However, the strict sorted order 1) significantly increases insertion overhead, 2) makes it challenging to support lock-free concurrency, and 3) harms in-node lookup/insertion efficiency due to model inaccuracy. In this paper, we introduce a Bucket-based Learned Index (BLI), which is an updatable in-memory learned index that adopts a "globally sorted, locally unsorted" approach by replacing linear sorted arrays with Buckets. BLI optimizes the insertion throughput by only sorting Buckets, not the key-value pairs within a Bucket. BLI strategically balances three critical performance metrics: tree fanouts, lookup/insert latency for inner nodes, lookup/insert latency for leaf…
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization
