TL;DR
This paper introduces a space-efficient way to find and store seeds in randomized data structures. It lowers the entropy of the stored seeds and yields a minimal perfect hash function construction with much higher throughput.
Contribution
It presents a novel method to encode successful seeds more efficiently, lowering entropy and enabling near-optimal space usage in minimal perfect hashing.
Findings
Reduces seed storage entropy by Ω(n) bits in most cases.
Achieves (1+ε)OPT space with O(n/ε) construction time.
Outperforms the state of the art in construction throughput by more than 100x for small ε.
Abstract
Randomised algorithms often employ methods that can fail and that are retried with independent randomness until they succeed. Randomised data structures therefore often store indices of successful attempts, called seeds. If n such seeds are required (e.g., for independent substructures) the standard approach is to compute for each i ∈ [n] the smallest successful seed S_i and store S = (S_1, ..., S_n). The central observation of this paper is that this is not space-optimal. We present a different algorithm that computes a sequence S' = (S'_1, ..., S'_n) of successful seeds such that the entropy of S' undercuts the entropy of S by Ω(n) bits in most cases. To achieve a memory consumption of OPT + εn, the expected number of inspected seeds increases by a factor of O(1/ε). We demonstrate the usefulness…
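The standard approach described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It only models the baseline under a simplifying assumption introduced here: every seed succeeds independently with the same probability p, so the smallest successful seed follows a geometric distribution. The function names and the Bernoulli success model are hypothetical and chosen for illustration only.

```python
import math
import random


def smallest_successful_seed(succeeds):
    """Return the smallest seed s >= 0 for which succeeds(s) is true."""
    seed = 0
    while not succeeds(seed):
        seed += 1
    return seed


def baseline_seeds(n, p, rng):
    """Baseline: for each of n independent substructures, store the
    smallest successful seed.

    Assumption (toy model, not from the paper): each attempt succeeds
    independently with probability p, regardless of the seed value.
    """
    return [smallest_successful_seed(lambda s: rng.random() < p)
            for _ in range(n)]


def geometric_entropy_bits(p):
    """Entropy in bits of the smallest successful seed in the toy model,
    i.e. of a geometric distribution with success probability p."""
    if p >= 1.0:
        return 0.0
    q = 1.0 - p
    return (-q * math.log2(q) - p * math.log2(p)) / p


if __name__ == "__main__":
    rng = random.Random(0)
    seeds = baseline_seeds(10, 0.25, rng)
    print("seeds:", seeds)
    for p in (0.5, 0.1, 0.01):
        gap = geometric_entropy_bits(p) - math.log2(1.0 / p)
        print(f"p={p}: entropy per seed exceeds log2(1/p) by {gap:.3f} bits")
```

In this toy model, the entropy of a smallest successful seed is log2(1/p) + ((1-p)/p) · log2(1/(1-p)) bits. For p = 0.5 that is 2 bits, compared with log2(1/p) = 1 bit, and the excess approaches log2(e) ≈ 1.44 bits as p shrinks. A constant excess per seed adds up to a linear number of bits over n seeds, which is in line with the Ω(n) saving the abstract reports. Treating log2(1/p) as the per-seed target is an assumption of this sketch, not a statement taken from the paper.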
