TL;DR
RecSplit is a minimal perfect hash function construction that stores the function in near-optimal space, with expected linear construction time and expected constant lookup time. Instances of the construction simultaneously beat earlier techniques on construction time and space usage.
Contribution
The paper presents a new recursive splitting technique for minimal perfect hashing that reduces space usage close to the theoretical lower bound while maintaining fast construction and query times.
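The idea of recursive splitting can be illustrated with a toy sketch. This is not the paper's algorithm or data layout: the binary splits, the `LEAF_SIZE` value, the nested-tuple tree, and the helper names `seeded_hash`, `build`, and `lookup` are all assumptions made for illustration. Each inner node searches for a seed that sends exactly half of its keys to the left; each leaf searches for a seed that maps its few keys one-to-one onto a small range.

```python
import hashlib
from itertools import count

LEAF_SIZE = 4  # hypothetical leaf size, chosen only to keep the search short


def seeded_hash(key: str, seed: int) -> int:
    """Deterministic 64-bit hash of `key` under `seed` (illustrative helper)."""
    data = seed.to_bytes(8, "little") + key.encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def build(keys):
    """Return a nested tuple describing a splitting tree for distinct `keys`."""
    n = len(keys)
    if n <= LEAF_SIZE:
        # Leaf: brute-force a seed mapping the keys one-to-one onto range(n).
        for seed in count():
            if len({seeded_hash(k, seed) % n for k in keys}) == n:
                return ("leaf", seed, n)
    half = n // 2
    # Inner node: brute-force a seed that sends exactly `half` keys left.
    for seed in count():
        left = [k for k in keys if seeded_hash(k, seed) % n < half]
        if len(left) == half:
            right = [k for k in keys if seeded_hash(k, seed) % n >= half]
            return ("split", seed, n, half, build(left), build(right))


def lookup(node, key):
    """Walk the tree, adding the sizes of the left subtrees that are skipped."""
    offset = 0
    while node[0] == "split":
        _, seed, n, half, left, right = node
        if seeded_hash(key, seed) % n < half:
            node = left
        else:
            offset += half
            node = right
    _, seed, n = node
    return offset + seeded_hash(key, seed) % n


keys = [f"key-{i}" for i in range(50)]
tree = build(keys)
values = sorted(lookup(tree, k) for k in keys)  # every index in 0..49, once
```

Only the seeds and the subtree sizes are needed to answer a query, so the keys themselves are never stored. A compact structure would encode those seeds in as few bits as possible; this sketch makes no attempt at that.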
Findings
Achieves 1.56 bits per key, within 8.3% of the 1.44 bits-per-key lower bound.
Construction time is less than 2 milliseconds per key.
Instances of the construction simultaneously beat state-of-the-art data structures on construction time and space usage.
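The gap between the two space figures can be checked with a short calculation. The identification of the 1.44 figure with $\log_2 e$ is a standard result for minimal perfect hashing over a large universe and is stated here as background, not taken from this summary.

```latex
\[
  \text{lower bound} \approx \log_2 e \approx 1.44 \ \text{bits per key},
\]
\[
  \text{relative overhead}
  = \frac{1.56 - 1.44}{1.44}
  = \frac{0.12}{1.44}
  \approx 0.083
  = 8.3\%.
\]
```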
Abstract
A minimal perfect hash function bijectively maps a key set S out of a universe U into the first |S| natural numbers. Minimal perfect hash functions are used, for example, to map irregularly-shaped keys, such as strings, in a compact space so that metadata can then be simply stored in an array. While it is known that just 1.44 bits per key are necessary to store a minimal perfect function, no published technique can go below about 2 bits per key in practice. We propose a new technique for storing minimal perfect hash functions with expected linear construction time and expected constant lookup time that makes it possible to build for the first time, for example, structures which need 1.56 bits per key, that is, within 8.3% of the lower bound, in less than 2 ms per key. We show that instances of our construction are able to simultaneously beat the construction time, space usage…
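To make the definition concrete, here is a minimal sketch of a minimal perfect hash function over five strings, found by brute-force seed search and then used to index a metadata array. The key set, the prices, and the helpers `seeded_hash` and `find_bijection_seed` are hypothetical. Exhaustive seed search over a whole key set is practical only for very small sets and is not the construction proposed in the paper.

```python
import hashlib
from itertools import count


def seeded_hash(key: str, seed: int) -> int:
    """Deterministic 64-bit hash of `key` under `seed` (illustrative helper)."""
    data = seed.to_bytes(8, "little") + key.encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def find_bijection_seed(keys):
    """Try seeds 0, 1, 2, ... until the distinct keys map onto range(len(keys))."""
    n = len(keys)
    for seed in count():
        if len({seeded_hash(k, seed) % n for k in keys}) == n:
            return seed


keys = ["apple", "kiwi", "fig", "mango", "pear"]  # hypothetical key set
seed = find_bijection_seed(keys)


def mphf(key: str) -> int:
    """Map each key in `keys` to a distinct index in 0..len(keys)-1."""
    return seeded_hash(key, seed) % len(keys)


# Metadata lives in a plain array indexed by the hash value.
prices = [None] * len(keys)
for key, price in zip(keys, [3, 5, 7, 4, 2]):
    prices[mphf(key)] = price
```

The function is only meaningful for keys in the original set: a key outside it still hashes to some index in range, so callers that may see unknown keys need a separate membership check.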
