Combined Search and Encoding for Seeds, with an Application to Minimal Perfect Hashing

Hans-Peter Lehmann; Peter Sanders; Stefan Walzer; Jonatan Ziegler

arXiv:2502.05613·cs.DS·July 3, 2025

TL;DR

This paper introduces a space-efficient way to choose and store seeds in randomised data structures. It lowers the entropy of the stored seed sequence and, applied to minimal perfect hashing, yields a construction with much higher throughput than prior approaches.

Contribution

Instead of storing the smallest successful seed for each of the n substructures, the method computes a different sequence of successful seeds whose entropy is lower, enabling near-optimal space usage in minimal perfect hashing.

Findings

01. Reduces the entropy of the stored seeds by Ω(n) bits in most cases, compared with storing the smallest successful seeds.

02. Achieves (1+ε)OPT space with O(n/ε) construction time.

03. Outperforms the state of the art in construction throughput by over 100x for small ε.

Abstract

Randomised algorithms often employ methods that can fail and that are retried with independent randomness until they succeed. Randomised data structures therefore often store indices of successful attempts, called seeds. If $n$ such seeds are required (e.g., for independent substructures) the standard approach is to compute for each $i \in [n]$ the smallest successful seed $S_i$ and store $S = (S_1, \dots, S_n)$. The central observation of this paper is that this is not space-optimal. We present a different algorithm that computes a sequence $S' = (S'_1, \dots, S'_n)$ of successful seeds such that the entropy of $S'$ undercuts the entropy of $S$ by $\Omega(n)$ bits in most cases. To achieve a memory consumption of $\mathrm{OPT} + \varepsilon n$, the expected number of inspected seeds increases by a factor of $O(1/\varepsilon)$. We demonstrate the usefulness…
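To make the baseline in the abstract concrete, here is a minimal Python sketch of the standard "smallest successful seed" approach only; it does not implement the paper's algorithm. The function names, the toy success model `attempt_succeeds`, and the fixed success probability `p` are all assumptions introduced for illustration. The sketch also compares the entropy of the smallest successful seed (a geometric distribution under this model) with $\log_2(1/p)$, used here as a reference point; whether that reference coincides with the paper's $\mathrm{OPT}$ is likewise an assumption.

```python
import math
import random


def attempt_succeeds(i: int, seed: int, p: float) -> bool:
    """Hypothetical stand-in for one randomised construction attempt.

    Succeeds with probability p, independently for each (i, seed) pair.
    """
    return random.Random(f"{i}:{seed}").random() < p


def smallest_successful_seeds(n: int, p: float) -> list[int]:
    """Baseline: for each substructure i, store the smallest seed that works."""
    seeds = []
    for i in range(n):
        seed = 0
        while not attempt_succeeds(i, seed, p):
            seed += 1
        seeds.append(seed)
    return seeds


def geometric_entropy_bits(p: float) -> float:
    """Entropy in bits of a geometric distribution with success probability p."""
    if p >= 1.0:
        return 0.0
    binary_entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return binary_entropy / p


def baseline_overhead_bits(p: float) -> float:
    """Per-seed entropy of the smallest successful seed minus log2(1/p)."""
    return geometric_entropy_bits(p) - math.log2(1 / p)
```

Under this toy model, `baseline_overhead_bits(0.5)` is 1 bit per seed, and the overhead approaches $\log_2 e \approx 1.44$ bits as `p` shrinks. Summed over $n$ seeds this is a gap linear in $n$, which gives some intuition for why a saving of $\Omega(n)$ bits is plausible.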
