One stone, two birds: A lightweight multidimensional learned index with cardinality support
Yingze Li, Hongzhi Wang, Xianglong Liu

TL;DR
This paper introduces CardIndex, a compact, multidimensional learned index that simultaneously performs efficient data indexing and cardinality estimation, reducing resource usage and latency in AI-driven database kernels.
Contribution
The paper presents CardIndex, a lightweight, multidimensional learned index that integrates cardinality estimation, achieving low latency and resource efficiency for database query processing.
Findings
CardIndex achieves a latency of 1-10 microseconds for both index and cardinality estimation tasks.
It reduces storage overhead by avoiding redundant data distribution parameters.
The hybrid estimation algorithm provides fast, exact results for low-selectivity queries.
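The idea behind these findings, one structure serving both lookups and range counts, can be illustrated with a minimal one-dimensional sketch. This is not the CardIndex implementation: the class `TinyLearnedIndex`, its linear position model, and the `exact_below` threshold are all hypothetical names and simplifications chosen for illustration, and the paper's structure is multidimensional. The sketch only assumes that a model mapping a key to its sorted position can answer a point lookup (predict, then search a bounded window) and a range count (difference of two predicted positions), with an exact index scan as the fallback when the estimated result is small.

```python
import bisect


class TinyLearnedIndex:
    """1-D toy: one linear position model reused for lookup and counting.

    Hypothetical illustration only, not the CardIndex structure.
    Requires a non-empty list of numeric keys.
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares fit of position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error bounds the local search window.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return self.slope * key + self.intercept

    def lookup(self, key):
        """Return a position holding `key`, or None if the key is absent."""
        n = len(self.keys)
        guess = int(round(self._predict(key)))
        err = int(self.max_err) + 1
        lo = min(max(0, guess - err), n)
        hi = max(lo, min(n, guess + err + 1))
        # Binary search only inside the error-bounded window.
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        if pos < n and self.keys[pos] == key:
            return pos
        return None

    def estimate_range(self, low, high):
        """Model-only estimate of the number of keys in [low, high]."""
        n = float(len(self.keys))
        upper = min(max(self._predict(high) + 1, 0.0), n)
        lower = min(max(self._predict(low), 0.0), n)
        return max(0.0, upper - lower)

    def count_range(self, low, high, exact_below=64):
        """Hybrid count: exact scan for small results, model estimate otherwise."""
        est = self.estimate_range(low, high)
        if est < exact_below:
            # Small estimated result: walk the index and return the exact count.
            left = bisect.bisect_left(self.keys, low)
            right = bisect.bisect_right(self.keys, high)
            return right - left
        return est


index = TinyLearnedIndex(range(0, 1000, 2))
position = index.lookup(10)             # point lookup through the model
small = index.count_range(10, 20)       # small result, exact path
large = index.count_range(0, 998)       # large result, model estimate
```

In one dimension an exact count over a sorted array is always cheap, so the threshold here only illustrates the control flow. The trade-off becomes meaningful in the multidimensional setting the paper targets, where scanning is costly for large results but affordable for small ones.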
Abstract
Innovative learning-based structures have recently been proposed to tackle index and cardinality estimation (CE) tasks, specifically learned indexes and data-driven cardinality estimators. These structures exhibit excellent performance in capturing data distribution, making them promising for integration into AI-driven database kernels. However, accurate estimation for corner-case queries requires a large number of network parameters, which demands more computing resources on expensive GPUs and more storage overhead. Additionally, implementing CE and the learned index separately is redundant, since the distribution of a single table is stored twice. These issues present challenges for designing AI-driven database kernels, because in real database scenarios a compact kernel is necessary to process queries within a limited storage and time budget. Directly integrating these two AI approaches…
Taxonomy
Topics: Data Management and Algorithms · Metabolomics and Mass Spectrometry Studies · Bayesian Modeling and Causal Inference
