ShockHash: Towards Optimal-Space Minimal Perfect Hashing Beyond Brute-Force
Hans-Peter Lehmann, Peter Sanders, Stefan Walzer

TL;DR
ShockHash builds minimal perfect hash functions from small, heavily overloaded cuckoo hash tables. It tests seeds until the resulting random graph is a pseudoforest, which takes far fewer trials than brute force and shortens the seed that has to be stored.
Contribution
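The paper presents ShockHash, a method for constructing minimal perfect hash functions (MPHFs) that replaces the brute-force search for a collision-free hash function with a search for a random graph that is a pseudoforest. This cuts both the number of seeds to test and the number of bits needed to store the seed.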
Findings
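Reduces the expected number of seeds tested from roughly e^n (brute force) to roughly (e/2)^n, which is still exponential but almost a factor of 2^n smaller
Stays close to the space lower bound for MPHFs, with a stored seed that is roughly n bits shorter than with brute force
Competing approaches need about two orders of magnitude more construction work to reach the same space efficiency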
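The gap between the two seed counts can be made concrete with a little arithmetic. The sketch below works in log2, since the seed is what gets stored: brute force on n keys needs n^n / n! trials in expectation (about e^n up to polynomial factors), while the (e/2)^n figure is taken from the findings as a leading term only. The function names are illustrative and not from the paper.

```python
import math

def log2_brute_force_trials(n):
    """log2 of n^n / n!, the expected number of uniformly random
    functions on n keys to try before one is a bijection."""
    return n * math.log2(n) - math.lgamma(n + 1) / math.log(2)

def log2_leading_term(n, base):
    """log2 of base^n, ignoring polynomial factors."""
    return n * math.log2(base)

for n in (16, 32, 64):
    brute = log2_leading_term(n, math.e)
    shock = log2_leading_term(n, math.e / 2)
    print(f"n={n}: exact brute force 2^{log2_brute_force_trials(n):.1f}, "
          f"e^n = 2^{brute:.1f}, (e/2)^n = 2^{shock:.1f}, "
          f"difference {brute - shock:.1f} bits")
```

The difference between the two leading terms is exactly n bits, because log2(e^n) - log2((e/2)^n) = n log2(2) = n.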
Abstract
A minimal perfect hash function (MPHF) maps a set S of n keys to the first n integers without collisions. There is a lower bound of n log2(e) - O(log n) bits of space needed to represent an MPHF. A matching upper bound is obtained using the brute-force algorithm that tries random hash functions until stumbling on an MPHF and stores that function's seed. In expectation, e^n poly(n) seeds need to be tested. The most space-efficient previous algorithms for constructing MPHFs all use such a brute-force approach as a basic building block. In this paper, we introduce ShockHash - Small, heavily overloaded cuckoo hash tables. ShockHash uses two hash functions h_0 and h_1, hoping for the existence of a function f : S -> {0,1} such that x -> h_{f(x)}(x) is an MPHF on S. In graph terminology, ShockHash generates n-edge random graphs until stumbling on…
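The seed search described in the abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the helper `h` (built on BLAKE2b), the union-find pseudoforest test, and the plain list used for the choice bits are all hypothetical stand-ins, and a real implementation would store the choice function far more compactly. Each key is treated as an edge between its two candidate slots; a seed is accepted when no connected component has more edges than nodes, and cuckoo-style eviction then picks one slot per key.

```python
import hashlib

def h(key, seed, which, n):
    """Hypothetical hash helper: slot in [0, n) for (seed, which, key)."""
    data = f"{seed}:{which}:{key}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big") % n

def is_pseudoforest(n, edges):
    """True if no connected component has more edges than nodes."""
    parent = list(range(n))
    nodes = [1] * n   # node count per component root
    count = [0] * n   # edge count per component root

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            nodes[rv] += nodes[ru]
            count[rv] += count[ru] + 1
        else:
            count[rv] += 1
        if count[rv] > nodes[rv]:
            return False
    return True

def build_choice_bits(keys, seed, n):
    """Cuckoo insertion: returns bits f with key i stored at h_{f[i]}(key i)."""
    slot = [None] * n
    f = [0] * len(keys)
    for i in range(len(keys)):
        cur = i
        for _ in range(4 * n + 4):  # generous bound; not expected to be hit
            p = h(keys[cur], seed, f[cur], n)
            cur, slot[p] = slot[p], cur  # place cur, pick up old occupant
            if cur is None:
                break
            f[cur] ^= 1  # evicted key moves to its other slot
        else:
            raise RuntimeError("insertion did not terminate")
    return f

def shockhash_sketch(keys, max_seeds=100_000):
    """Try seeds until the two-choice graph is a pseudoforest."""
    n = len(keys)
    for seed in range(max_seeds):
        edges = [(h(k, seed, 0, n), h(k, seed, 1, n)) for k in keys]
        if is_pseudoforest(n, edges):
            return seed, build_choice_bits(keys, seed, n)
    raise RuntimeError("no suitable seed found")

keys = [f"key{i}" for i in range(10)]
seed, f = shockhash_sketch(keys)
print(seed, sorted(h(k, seed, f[i], len(keys)) for i, k in enumerate(keys)))
```

With n edges on n nodes, passing the pseudoforest test means every component has exactly as many edges as nodes, so every slot ends up holding exactly one key and the mapping is minimal and perfect.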
Taxonomy
Topics: Caching and Content Delivery · Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques
