Evaluating Learned Indexes for External-Memory Joins

Yuvaraj Chesetti; Prashant Pandey

arXiv:2407.00590 · cs.DB · May 26, 2025

Open Access

TL;DR

This paper evaluates the effectiveness of learned indexes in external-memory join operations, comparing their performance and space efficiency against traditional methods across various datasets and conditions.

Contribution

It provides a comprehensive analysis of learned index-based joins for external-memory scenarios, highlighting their trade-offs and performance relative to traditional join algorithms.

Findings

01. Learned indexes can trade accuracy for space without significant performance loss.

02. They produce smaller indexes but incur I/O costs similar to those of B-trees in external-memory joins.

03. Construction times for learned indexes are about 1000 times longer than those of traditional indexes.
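
The accuracy-for-space trade-off in the first finding can be illustrated with a small sketch. It assumes an error-bounded piecewise-linear model built with a greedy "shrinking cone" pass, in the style of indexes such as FITing-tree or PGM-index; this is an illustrative stand-in, not necessarily the index evaluated in the paper. The names `build_segments`, `predict`, and `epsilon` are hypothetical. A larger `epsilon` (lower accuracy) needs fewer segments, so the index is smaller.

```python
import bisect
import math


def build_segments(keys, epsilon):
    """Greedily cover strictly increasing `keys` with linear segments.

    Each segment is (first_key, first_pos, slope) and predicts every key it
    covers to within `epsilon` positions. Assumption: keys are unique and sorted.
    """
    segments = []
    start = 0
    lo, hi = 0.0, math.inf  # feasible slope interval for the open segment

    def pick_slope(lo, hi):
        # A single-key segment has no slope constraint; any value works.
        return 0.0 if math.isinf(hi) else (lo + hi) / 2

    for i in range(1, len(keys)):
        dx = keys[i] - keys[start]
        dy = i - start
        new_lo = max(lo, (dy - epsilon) / dx)
        new_hi = min(hi, (dy + epsilon) / dx)
        if new_lo > new_hi:
            # No single slope fits key i as well: close the segment here.
            segments.append((keys[start], start, pick_slope(lo, hi)))
            start, lo, hi = i, 0.0, math.inf
        else:
            lo, hi = new_lo, new_hi
    segments.append((keys[start], start, pick_slope(lo, hi)))
    return segments


def predict(segments, key):
    """Return the approximate position of `key` in the sorted key array."""
    # (key, inf, inf) sorts after any segment starting exactly at `key`.
    j = max(bisect.bisect_right(segments, (key, math.inf, math.inf)) - 1, 0)
    first_key, first_pos, slope = segments[j]
    return first_pos + slope * (key - first_key)


if __name__ == "__main__":
    keys = [i * i for i in range(1, 200)]  # synthetic, skewed key distribution
    for eps in (1, 8, 64):
        print(eps, len(build_segments(keys, eps)))
```

Each segment stores three numbers regardless of how many keys it covers, so the segment count is a rough proxy for index size under these assumptions.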

Abstract

Joins are among the most time-consuming and data-intensive operations in relational query processing. Much research effort has been applied to the optimization of join processing due to their frequent execution. Recent studies have shown that CDF-based learned models can create smaller and faster indexes, accelerating in-memory joins. However, their effectiveness for external-memory joins, which are crucial for large-scale databases, remains underexplored. This paper evaluates the impact of learned indexes on external-memory joins for both sorted and unsorted data. We compare learned index-based joins against traditional join methods such as hash joins, sort joins, and indexed nested-loop joins on real-world and simulated datasets. Additionally, we analyze learned index-based joins across multiple dimensions, including storage device types, data sorting, parallelism, constrained memory…
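
To make the idea of a learned index-based join concrete, here is a minimal in-memory sketch of an indexed nested-loop join in which a CDF-style linear model replaces the tree traversal. It assumes a single linear model fitted through the endpoints of the sorted inner relation, with its maximum error measured at build time; the paper's actual indexes and external-memory layout are not reproduced here, and `fit_linear_model` and `learned_inlj` are hypothetical names. In an external-memory setting, the bounded search window would correspond to the disk pages that must be fetched per probe.

```python
import bisect


def fit_linear_model(sorted_keys):
    """Fit position ~ slope * key + intercept and record the worst-case error.

    Assumption: `sorted_keys` is non-empty, sorted, and holds unique keys.
    """
    n = len(sorted_keys)
    k0, k1 = sorted_keys[0], sorted_keys[-1]
    slope = (n - 1) / (k1 - k0) if k1 != k0 else 0.0
    intercept = -slope * k0
    max_err = max(abs(i - (slope * k + intercept)) for i, k in enumerate(sorted_keys))
    # Round up and add slack for truncating the prediction to an integer.
    return slope, intercept, int(max_err) + 1


def learned_inlj(outer_keys, inner_sorted):
    """Join outer keys against the sorted inner relation.

    Returns (key, position_in_inner) for every outer key found in the inner
    relation, in outer order.
    """
    slope, intercept, err = fit_linear_model(inner_sorted)
    n = len(inner_sorted)
    matches = []
    for k in outer_keys:
        guess = int(slope * k + intercept)
        # Clamp the search window [lo, hi) to the array bounds.
        lo = min(max(0, guess - err), n)
        hi = min(n, max(lo, guess + err + 1))
        i = bisect.bisect_left(inner_sorted, k, lo, hi)
        if i < hi and inner_sorted[i] == k:
            matches.append((k, i))
    return matches


if __name__ == "__main__":
    inner = [2, 3, 5, 8, 13, 21, 34, 55]
    print(learned_inlj([5, 4, 55, 2, 100], inner))
```

The model itself is just three numbers, which is why such an index can be much smaller than a B-tree; the cost moves into the width of the search window, which grows with the model's error.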

Taxonomy

Topics: Semantic Web and Ontologies