Learned Static Function Data Structures
Stefan Hermann, Hans-Peter Lehmann, Giorgio Vinciguerra, Stefan Walzer

TL;DR
This paper introduces learned static functions, which use a machine learning model to predict a probability distribution over the values for each key. This yields significant space savings over traditional static function data structures while still supporting point queries.
Contribution
It combines machine learning with static function data structures so that space usage can fall below the zero-order empirical entropy of the value sequence.
Findings
Achieves up to tenfold space reduction on real data
Attains up to thousandfold space savings on synthetic data
Supports point queries while using less space than the zero-order entropy bound
Abstract
We consider the task of constructing a data structure for associating a static set of keys with values, while allowing arbitrary output values for queries involving keys outside the set. Compared to hash tables, these so-called static function data structures do not need to store the key set and thus use significantly less memory. Several techniques are known, with compressed static functions approaching the zero-order empirical entropy of the value sequence. In this paper, we introduce learned static functions, which use machine learning to capture correlations between keys and values. For each key, a model predicts a probability distribution over the values, from which we derive a key-specific prefix code to compactly encode the true value. The resulting codeword is stored in a classic static function data structure. This design allows learned static functions to break the zero-order…
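The encoding step described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and not taken from the paper: `toy_model` is a hypothetical stand-in for a learned model, Huffman coding is just one way to obtain a prefix code from a distribution, and a plain `dict` stands in for the static function data structure (a real static function would not store the keys).

```python
import heapq
import math
from collections import Counter


def huffman_code(probs):
    """Build a prefix code (value -> bit string) from a dict value -> probability."""
    if len(probs) == 1:
        return {next(iter(probs)): "0"}
    # Heap entries: (probability, unique tie breaker, {value: partial codeword}).
    heap = [(p, i, {v: ""}) for i, (v, p) in enumerate(sorted(probs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {v: "0" + w for v, w in c0.items()}
        merged.update({v: "1" + w for v, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]


def toy_model(key):
    """Hypothetical model: predicts that the value matches the key's parity."""
    likely = "even" if key % 2 == 0 else "odd"
    return {v: (0.9 if v == likely else 0.05) for v in ("even", "odd", "other")}


def query(key, stored):
    """Rebuild the key-specific code from the model and decode the stored codeword."""
    code = huffman_code(toy_model(key))
    decode = {w: v for v, w in code.items()}
    return decode[stored[key]]


# Synthetic data (an assumption): values follow key parity, with some exceptions.
keys = list(range(1000))
values = {k: "other" if k % 20 == 0 else ("even" if k % 2 == 0 else "odd") for k in keys}

# Encode each true value with the prefix code derived from that key's prediction.
# The dict is only a stand-in for a static function data structure.
stored = {k: huffman_code(toy_model(k))[values[k]] for k in keys}

total_bits = sum(len(w) for w in stored.values())

# Zero-order empirical entropy of the value sequence, in bits.
counts = Counter(values.values())
h0_bits = -sum(c * math.log2(c / len(keys)) for c in counts.values())

print(f"key-specific codes: {total_bits} bits, zero-order entropy: {h0_bits:.1f} bits")
```

Because the code depends on the key, values that the model predicts well get short codewords, so the total can drop below what any single key-independent code could achieve. A whole-bit prefix code such as Huffman still spends at least one bit per key in this sketch, so it cannot show the larger savings reported in the findings.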
Taxonomy
Topics: Data Quality and Management · Algorithms and Data Compression · Time Series Analysis and Forecasting
