Benchmarking Learned Indexes
Ryan Marcus, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra, Alfons Kemper, Thomas Neumann, Tim Kraska

TL;DR
This paper presents a unified benchmark that compares learned index structures with traditional indexes on four real-world datasets, shows where learned indexes hold a performance advantage, and analyzes the factors that affect their efficiency.
Contribution
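It introduces a unified benchmark for learned indexes, evaluates well-tuned implementations of three learned index structures against state-of-the-art traditional baselines, and examines the factors that influence their performance, along with properties such as multi-threaded behavior and build times.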
Findings
Learned indexes can outperform traditional indexes in read-only, in-memory workloads over a dense array.
Caching, pipelining, dataset size, and key size all affect learned index performance.
The paper also examines the multi-threaded performance and build times of learned indexes.
Abstract
Recent advancements in learned index structures propose replacing existing index structures, like B-Trees, with approximate learned models. In this work, we present a unified benchmark that compares well-tuned implementations of three learned index structures against several state-of-the-art "traditional" baselines. Using four real-world datasets, we demonstrate that learned index structures can indeed outperform non-learned indexes in read-only in-memory workloads over a dense array. We also investigate the impact of caching, pipelining, dataset size, and key size. We study the performance profile of learned index structures, and build an explanation for why learned models achieve such good performance. Finally, we investigate other important properties of learned index structures, such as their performance in multi-threaded systems and their build times.
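To make the idea of replacing a B-Tree with an "approximate learned model" over a dense sorted array concrete, here is a minimal sketch. It is not any of the three structures the paper benchmarks: it assumes a single linear model fitted through the smallest and largest key, and the class name `LinearLearnedIndex` and its methods are hypothetical. The model predicts a position, and a recorded maximum error bounds the range that a final binary search has to cover.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index (illustrative only): one linear model plus an error bound."""

    def __init__(self, keys):
        # Assumption: `keys` is a non-empty, sorted sequence of numbers.
        self.keys = list(keys)
        if not self.keys:
            raise ValueError("keys must be non-empty")
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Fit a line through the first and last key (position 0 and n - 1).
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        self.intercept = -self.slope * lo
        # Record the worst-case gap between predicted and true position.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        # Approximate position of `key`, clamped to the valid index range.
        pos = int(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lower_bound(self, key):
        # Index of the first stored key >= `key` (len(keys) if there is none).
        pos = self._predict(key)
        lo = max(pos - self.max_err, 0)
        hi = min(pos + self.max_err + 1, len(self.keys))
        # Binary search only inside the window allowed by the error bound.
        return bisect.bisect_left(self.keys, key, lo, hi)


if __name__ == "__main__":
    index = LinearLearnedIndex([2, 3, 5, 7, 11, 13, 17, 19, 23, 29])
    print(index.lower_bound(7), index.max_err)
```

The smaller the error bound, the smaller the search window, which is why the quality of the model's fit to the key distribution matters for lookup time in a structure like this.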
