BLI: A High-performance Bucket-based Learned Index with Concurrency Support

Huibing Dong; Wenlong Wang; Chun Liu; and David Du

arXiv:2502.10597·cs.DB·February 18, 2025

Open Access

TL;DR

The paper introduces BLI, a bucket-based learned index that improves insertion throughput, supports lock-free concurrency, and balances lookup latency, insertion latency, and memory consumption. It reports up to 2.21x higher throughput than state-of-the-art learned indexes, and up to 3.91x gains under multi-threaded workloads.

Contribution

BLI is a novel bucket-based learned index that adopts a 'globally sorted, locally unsorted' approach, enabling efficient updates and lock-free concurrency.

Findings

01. BLI achieves up to 2.21x higher throughput than state-of-the-art learned indexes.

02. BLI attains up to 3.91x performance gains under multi-threaded workloads.

03. BLI effectively balances lookup latency, insertion latency, and memory consumption.

Abstract

Learned indexes are promising to replace traditional tree-based indexes. They typically employ machine learning models to efficiently predict target positions in strictly sorted linear arrays. However, the strict sorted order 1) significantly increases insertion overhead, 2) makes it challenging to support lock-free concurrency, and 3) harms in-node lookup/insertion efficiency due to model inaccuracy. In this paper, we introduce a Bucket-based Learned Index (BLI), which is an updatable in-memory learned index that adopts a "globally sorted, locally unsorted" approach by replacing linear sorted arrays with Buckets. BLI optimizes the insertion throughput by only sorting Buckets, not the key-value pairs within a Bucket. BLI strategically balances three critical performance metrics: tree fanouts, lookup/insert latency for inner nodes, lookup/insert latency for leaf…
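
The "globally sorted, locally unsorted" idea from the abstract can be illustrated with a small single-threaded sketch. This is not the paper's implementation: the class name `BucketIndex`, the `capacity` parameter, the `fences` array, and the split policy are all hypothetical, and a plain binary search stands in for the learned model that would predict the target Bucket. It only shows why an insert can append to a Bucket without shifting sorted entries.

```python
from bisect import bisect_right


class BucketIndex:
    """Toy 'globally sorted, locally unsorted' index (illustrative only)."""

    def __init__(self, capacity=4):
        self.capacity = capacity          # hypothetical max entries per bucket
        self.fences = [float("-inf")]     # lower bound of each bucket, kept sorted
        self.buckets = [[]]               # each bucket: unsorted list of (key, value)

    def _locate(self, key):
        # Assumption: binary search over bucket lower bounds replaces
        # the learned model's position prediction.
        return bisect_right(self.fences, key) - 1

    def insert(self, key, value):
        i = self._locate(key)
        bucket = self.buckets[i]
        for j, (k, _) in enumerate(bucket):
            if k == key:                  # overwrite an existing key
                bucket[j] = (key, value)
                return
        bucket.append((key, value))       # append only: no in-bucket ordering
        if len(bucket) > self.capacity:
            self._split(i)

    def _split(self, i):
        # Sorting happens only here, when a bucket overflows.
        items = sorted(self.buckets[i], key=lambda kv: kv[0])
        mid = len(items) // 2
        self.buckets[i] = items[:mid]
        self.buckets.insert(i + 1, items[mid:])
        self.fences.insert(i + 1, items[mid][0])

    def get(self, key, default=None):
        # Linear scan inside one small, unsorted bucket.
        for k, v in self.buckets[self._locate(key)]:
            if k == key:
                return v
        return default
```

In this sketch, ordering is maintained only across Buckets (through `fences`), so an insert touches a single Bucket and a lookup scans at most `capacity` entries after locating it. Lock-free concurrency, which the paper targets, is outside the scope of this example.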

Taxonomy

Topics: Algorithms and Data Compression · Network Packet Processing and Optimization