On the Costs and Benefits of Learned Indexing for Dynamic High-Dimensional Data: Extended Version
Terézia Slanináková, Jaroslav Olha, David Procházka, Matej Antol, Vlastislav Dohnal

TL;DR
This paper investigates how to adapt learned indexes for dynamic, high-dimensional data by introducing dynamization techniques, cost models, and experimental analysis to determine when dynamic indexes outperform static ones as datasets grow.
Contribution
It presents a method for dynamizing static learned indexes for complex data and evaluates the cost-performance trade-offs through an amortized cost model.
Findings
Dynamized indexes scale better with growing datasets.
The cost model helps identify when dynamic indexes are preferable.
Experiments show that the dynamic index overtakes its static counterpart in overall cost (build plus query) as the database grows.
Abstract
One of the main challenges within the growing research area of learned indexing is the lack of adaptability to dynamically expanding datasets. This paper explores the dynamization of a static learned index for complex data through operations such as node splitting and broadening, enabling efficient adaptation to new data. Furthermore, we evaluate the trade-offs between static and dynamic approaches by introducing an amortized cost model that assesses query performance in tandem with the build costs of the index structure, enabling experimental determination of when a dynamic learned index outperforms its static counterpart. We apply the dynamization method to a static learned index and demonstrate that, thanks to its superior scaling, it quickly surpasses the static implementation in overall cost as the database grows. This is an extended version of the paper presented at DaWaK 2025.
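To make the idea of an amortized cost concrete, the following is a minimal sketch, not the paper's actual model. It assumes that the amortized cost after each growth step is the cumulative build cost plus the cumulative query cost, divided by the cumulative number of queries answered. The function names (`amortized_costs`, `first_crossover`) and all numbers in the usage note are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def amortized_costs(build_costs, query_costs, queries_per_step):
    """Amortized cost per query after each growth step.

    Assumed definition (not taken from the paper): cumulative
    (build + query) cost divided by cumulative number of queries.
    All three sequences hold one value per growth step.
    """
    if any(n <= 0 for n in queries_per_step):
        raise ValueError("every step must answer at least one query")
    total_cost = accumulate(b + q for b, q in zip(build_costs, query_costs))
    total_queries = accumulate(queries_per_step)
    return [cost / n for cost, n in zip(total_cost, total_queries)]


def first_crossover(static, dynamic):
    """Index of the first step where the dynamic index is cheaper, or None."""
    for step, (s, d) in enumerate(zip(static, dynamic)):
        if d < s:
            return step
    return None
```

For example, with made-up costs where the static index is rebuilt from scratch at every step (build costs 10, 20, 40) and the dynamic index pays a higher initial cost but only small restructuring costs afterwards (12, 3, 3), calling `first_crossover` on the two amortized series reports the first step at which the dynamic variant becomes cheaper overall.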
Taxonomy
Topics: Advanced Database Systems and Queries · Data Quality and Management · Data Management and Algorithms
