Linear Hashing Is Optimal

Michael Jaber; Vinayak M. Kumar; David Zuckerman

arXiv:2505.14061·cs.DS·May 21, 2025

TL;DR

This paper proves that linear hashing with a random matrix over GF(2) achieves expected maximum load O(log n / log log n) when hashing n balls into n bins, matching fully random hashing and resolving a longstanding open problem.

Contribution

The paper establishes that linear hashing is asymptotically optimal in expected maximum load, matching fully random functions.

Findings

1. Expected maximum load is O(log n / log log n) when hashing n balls into n bins.

2. Maximum load exceeds r·log n / log log n with probability at most O(1/r^2).

3. Resolves an open question posed by Alon, Dietzfelbinger, Miltersen, Petrank, and Tardos (STOC '97, JACM '99).
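
The tail bound is consistent with the expectation bound, as the following short calculation suggests. It is a sketch under stated assumptions, not the paper's argument: the symbols $M$ (maximum load), $L = \log n / \log\log n$, and the absolute constant $C$ are introduced here for illustration, and the tail bound is assumed to hold in the form $\Pr[M > rL] \le C/r^{2}$ for all $r \ge 1$.

$$
\mathbb{E}[M] \;=\; \int_{0}^{\infty} \Pr[M > t]\,dt \;=\; L \int_{0}^{\infty} \Pr[M > rL]\,dr \;\le\; L\left(1 + \int_{1}^{\infty} \frac{C}{r^{2}}\,dr\right) \;=\; (1 + C)\,L .
$$

The key point is that $1/r^{2}$ is integrable on $[1, \infty)$, so a tail of this shape yields $\mathbb{E}[M] = O(\log n / \log\log n)$.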

Abstract

We prove that hashing $n$ balls into $n$ bins via a random matrix over $\mathbb{F}_{2}$ yields expected maximum load $O(\log n / \log\log n)$. This matches the expected maximum load of a fully random function and resolves an open question posed by Alon, Dietzfelbinger, Miltersen, Petrank, and Tardos (STOC '97, JACM '99). More generally, we show that the maximum load exceeds $r \cdot \log n / \log\log n$ with probability at most $O(1/r^{2})$.
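
To make the hashing scheme in the abstract concrete, here is a minimal simulation sketch in Python. It is an illustration, not code from the paper. Several choices are assumptions made for the example: keys are `u`-bit integers read as vectors over $\mathbb{F}_{2}$, the number of bins is $n = 2^{k}$, the matrix is stored as `k` integer rows, and the key set is drawn at random purely for convenience. The function names (`random_matrix_rows`, `linear_hash`, `max_load`, `demo`) are hypothetical.

```python
import random
from collections import Counter


def random_matrix_rows(k, u, rng):
    """Return a uniformly random k x u matrix over F_2, one u-bit integer per row."""
    return [rng.getrandbits(u) for _ in range(k)]


def linear_hash(rows, x):
    """Compute the matrix-vector product M x over F_2.

    Output bit i is the parity of the bitwise AND of row i with x,
    which is the inner product of row i and x modulo 2.
    """
    h = 0
    for i, row in enumerate(rows):
        parity = bin(row & x).count("1") & 1
        h |= parity << i
    return h


def max_load(keys, rows):
    """Hash every key and return the size of the fullest bin."""
    loads = Counter(linear_hash(rows, x) for x in keys)
    return max(loads.values())


def demo(k=10, u=32, seed=0):
    """Hash n = 2**k distinct u-bit keys into n bins; return (n, maximum load)."""
    rng = random.Random(seed)
    n = 1 << k
    keys = rng.sample(range(1 << u), n)  # assumption: random distinct keys
    rows = random_matrix_rows(k, u, rng)
    return n, max_load(keys, rows)


if __name__ == "__main__":
    n, load = demo()
    print(f"n = {n}, maximum load = {load}")
```

Because the map is linear, `linear_hash(rows, x ^ y)` equals `linear_hash(rows, x) ^ linear_hash(rows, y)`, which is the structure that separates this family from a fully random function. A single run on random keys says nothing about the asymptotic bound; repeating `demo` over many seeds and over structured key sets (for example, all keys in a low-dimensional subspace) gives a better feel for how the maximum load behaves.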

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques