Why Are Learned Indexes So Effective but Sometimes Ineffective?
Qiyu Liu, Siyuan Han, Yanlin Qi, Jingshu Peng, Jin Li, Longlong Lin, Lei Chen

TL;DR
This paper analyzes the theoretical efficiency and practical limitations of learned indexes, especially PGM-Index, and proposes PGM++ to significantly improve lookup performance while maintaining space efficiency.
Contribution
It provides the tightest theoretical bounds for PGM-Index and introduces PGM++, a new extension that enhances practical performance through hybrid search strategies.
Findings
PGM-Index can achieve O(log log N) lookup time with high probability.
Querying PGM-Indexes is highly memory-bound, limiting performance.
PGM++ outperforms the original PGM-Index and state-of-the-art learned indexes, with queries up to 2.31x faster.
Abstract
Learned indexes have attracted significant research interest due to their ability to offer better space-time trade-offs compared to traditional B+-tree variants. Among various learned indexes, the PGM-Index, based on error-bounded piecewise linear approximation, is an elegant data structure that has demonstrated provably superior performance over conventional B+-tree indexes. In this paper, we explore two interesting research questions regarding the PGM-Index: (a) Why are PGM-Indexes theoretically effective? and (b) Why do PGM-Indexes underperform in practice? For question (a), we first prove that, for a set of N sorted keys, the PGM-Index can, with high probability, achieve a lookup time of O(log log N) while using O(N) space. To the best of our knowledge, this is the tightest bound for learned indexes to date. For question (b), we identify that…
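To make "error-bounded piecewise linear approximation" concrete, here is a minimal Python sketch of the idea. It is not the authors' implementation and not PGM++. It assumes distinct, sorted numeric keys, uses a simple greedy "shrinking cone" fit rather than the optimal segmentation, and builds only a single level of segments, whereas the PGM-Index indexes its segments recursively. The names `build_segments`, `TinyPLAIndex`, and `eps` are hypothetical and chosen for illustration.

```python
from bisect import bisect_left, bisect_right


def build_segments(keys, eps):
    """Greedy fit (a simplification): each segment is (first_key, slope, first_pos)
    and predicts the position of every key it covers to within eps."""
    if not keys:
        return []
    segments = []
    start, lo, hi = 0, 0.0, float("inf")  # [lo, hi] is the range of feasible slopes
    for i in range(1, len(keys)):
        dx = keys[i] - keys[start]  # > 0 because keys are assumed distinct
        dy = i - start
        new_lo = max(lo, (dy - eps) / dx)
        new_hi = min(hi, (dy + eps) / dx)
        if new_lo > new_hi:
            # No single slope fits this point too: close the segment, start a new one.
            segments.append((keys[start], (lo + hi) / 2, start))
            start, lo, hi = i, 0.0, float("inf")
        else:
            lo, hi = new_lo, new_hi
    # A final segment holding a single key has no slope constraint; use 0.
    slope = 0.0 if hi == float("inf") else (lo + hi) / 2
    segments.append((keys[start], slope, start))
    return segments


class TinyPLAIndex:
    """Single-level illustration only; the real PGM-Index is recursive."""

    def __init__(self, keys, eps):
        self.keys = keys
        self.eps = eps
        self.segments = build_segments(keys, eps)
        self.first_keys = [s[0] for s in self.segments]

    def lookup(self, key):
        """Return the position of key in keys, or None if it is absent."""
        if not self.segments:
            return None
        # 1. Find the segment whose first key is the largest one <= key.
        j = max(bisect_right(self.first_keys, key) - 1, 0)
        first_key, slope, first_pos = self.segments[j]
        # 2. Predict the position with the segment's linear model.
        pred = first_pos + round(slope * (key - first_key))
        # 3. Binary search only a window of about 2 * eps positions around the
        #    prediction (one extra slot of slack for floating-point rounding).
        n = len(self.keys)
        lo = min(max(pred - self.eps - 1, 0), n)
        hi = min(max(pred + self.eps + 2, lo), n)
        i = bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None


keys = [i * i + 3 * i for i in range(200)]  # sorted, distinct sample keys
index = TinyPLAIndex(keys, eps=4)
print(len(index.segments), "segments for", len(keys), "keys")
print(index.lookup(keys[17]), index.lookup(keys[5] + 1))
```

The final step touches only a small window of the sorted array regardless of N, which is why the cost of a lookup is dominated by locating the right segment and by the memory accesses involved.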
Taxonomy
Topics: Statistics Education and Methodologies · Advanced Text Analysis Techniques · Educational Assessment and Pedagogy
