From Specific to Generic Learned Sorted Set Dictionaries: A Theoretically Sound Paradigm Yielding Competitive Data Structural Boosters in Practice
Domenico Amato, Giosué Lo Bosco, Raffaele Giancarlo

TL;DR
This paper introduces a new paradigm for Learned Sorted Set Dictionaries that generalizes existing methods, enabling the creation of efficient, competitive data structures with theoretical guarantees and practical performance improvements.
Contribution
It presents a novel paradigm for Learned Sorted Set Dictionaries that can be applied to various data structures, providing theoretical bounds and practical efficiency.
Findings
First Learned Optimum Binary Search Forest with entropy-based access time
Learned Sorted Set Dictionary matching classic bounds in dynamic, amortized setting
Generalized approach yields effective, competitive data structural boosters
Abstract
This research concerns Learned Data Structures, a recent area that has emerged at the crossroads of Machine Learning and Classic Data Structures. It is methodologically important and has a high practical impact. We focus on Learned Indexes, i.e., Learned Sorted Set Dictionaries. The proposals available so far are specific in the sense that they can boost, indeed impressively, the time performance of Table Search Procedures with a sorted layout only, e.g., Binary Search. We propose a novel paradigm that, complementing known specialized ones, can produce Learned versions of any Sorted Set Dictionary, for instance, Balanced Binary Search Trees or Binary Search on layouts other than sorted, e.g., Eytzinger. Theoretically, based on it, we obtain several results of interest, such as (a) the first Learned Optimum Binary Search Forest, with mean access time bounded by the Entropy of the…
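To make concrete the kind of "specific" Learned Index the abstract refers to, one that boosts Binary Search on a sorted table, here is a minimal Python sketch. It is not the paradigm proposed in the paper, only a common textbook-style illustration: a least-squares line predicts a key's position, and Binary Search is then confined to a window around that prediction. The names fit_linear, build, and lookup are hypothetical, and the sketch assumes a non-empty sorted list of distinct numeric keys.

```python
import bisect


def fit_linear(keys):
    """Least-squares line mapping a key to its index in the sorted list."""
    n = len(keys)
    mean_x = sum(keys) / n
    mean_y = (n - 1) / 2
    var = sum((x - mean_x) ** 2 for x in keys)
    if var == 0:  # single key: predict the only position
        return 0.0, mean_y
    cov = sum((x - mean_x) * (i - mean_y) for i, x in enumerate(keys))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def build(keys):
    """Return (slope, intercept, eps); eps is the worst prediction error."""
    slope, intercept = fit_linear(keys)
    eps = max(abs(i - round(slope * x + intercept)) for i, x in enumerate(keys))
    return slope, intercept, eps


def lookup(keys, model, q):
    """Index of q in keys, or -1 if absent (illustrative helper)."""
    slope, intercept, eps = model
    n = len(keys)
    guess = round(slope * q + intercept)
    # Clamp the search window [lo, hi) to the table boundaries.
    lo = max(0, min(n, guess - eps))
    hi = max(0, min(n, guess + eps + 1))
    # Binary Search restricted to the predicted window.
    i = bisect.bisect_left(keys, q, lo, hi)
    return i if i < n and keys[i] == q else -1


keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
model = build(keys)
found = [lookup(keys, model, k) for k in keys]
```

Because eps is the maximum prediction error over the stored keys, every stored key falls inside its window, so the restricted search never misses it. The better the model fits the data, the smaller the window that Binary Search has to cover.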
Taxonomy
Topics: Algorithms and Data Compression · Text and Document Classification Technologies · Image Retrieval and Classification Techniques
