Learned Static Function Data Structures

Stefan Hermann; Hans-Peter Lehmann; Giorgio Vinciguerra; Stefan Walzer

arXiv:2510.27588 · cs.DS · May 20, 2026

TL;DR

This paper introduces learned static functions, which use a machine learning model to predict, for each key, a probability distribution over the values. This yields significant space savings over traditional static function data structures while still supporting point queries.

Contribution

The paper combines machine learning with static function data structures to go below the zero-order empirical entropy of the value sequence, the space bound that compressed static functions approach.

Findings

01 Achieves up to tenfold space reduction on real data, compared with traditional static function data structures

02 Attains up to thousandfold space savings on synthetic data

03 Supports point queries while using less space than the zero-order entropy of the values

Abstract

We consider the task of constructing a data structure for associating a static set of keys with values, while allowing arbitrary output values for queries involving keys outside the set. Compared to hash tables, these so-called static function data structures do not need to store the key set and thus use significantly less memory. Several techniques are known, with compressed static functions approaching the zero-order empirical entropy of the value sequence. In this paper, we introduce learned static functions, which use machine learning to capture correlations between keys and values. For each key, a model predicts a probability distribution over the values, from which we derive a key-specific prefix code to compactly encode the true value. The resulting codeword is stored in a classic static function data structure. This design allows learned static functions to break the zero-order…
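The pipeline described in the abstract (model, key-specific prefix code, stored codeword) can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the authors' implementation: `toy_model` is a hypothetical stand-in for a learned model, Huffman coding is one possible choice of prefix code, and a plain Python dict stands in for the classic static function (a real static function would store the codeword bits without storing the keys).

```python
import heapq
import math
from collections import Counter
from itertools import count

VALUES = ["A", "B", "C", "D"]


def huffman_code(probs):
    """Build a prefix code {symbol: bit string} from {symbol: probability}."""
    if len(probs) == 1:
        return {next(iter(probs)): "0"}
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def toy_model(key):
    """Hypothetical model: predicts a distribution over VALUES for a key."""
    guess = VALUES[key % 4]
    return {v: (0.85 if v == guess else 0.05) for v in VALUES}


def true_value(key):
    """Synthetic ground truth: mostly predictable from the key, with exceptions."""
    shift = 1 if key % 10 == 0 else 0
    return VALUES[(key + shift) % 4]


keys = range(1000)

# Construction: encode each true value with the prefix code derived from the
# model's prediction for that key. The dict is only a stand-in for a static
# function; it stores keys, which a real static function would not.
stored = {k: huffman_code(toy_model(k))[true_value(k)] for k in keys}


def query(key):
    """Rebuild the key-specific code from the model and decode the stored bits."""
    code = huffman_code(toy_model(key))
    decode = {word: value for value, word in code.items()}
    return decode[stored[key]]


# Compare total codeword length against the zero-order empirical entropy
# of the value sequence (in bits).
learned_bits = sum(len(word) for word in stored.values())
freq = Counter(true_value(k) for k in keys)
n = len(keys)
zero_order_bits = -sum(c * math.log2(c / n) for c in freq.values())

print(f"learned codewords: {learned_bits} bits")
print(f"zero-order entropy: {zero_order_bits:.0f} bits")
```

In this toy setting the values look nearly uniform when viewed without their keys, so a zero-order code needs close to two bits per value, while a correctly predicted value costs a single bit under the key-specific code. The sketch ignores the space taken by the model itself and the overhead of the underlying static function, both of which matter in a real comparison.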

Taxonomy

Topics: Data Quality and Management · Algorithms and Data Compression · Time Series Analysis and Forecasting