The Power of Simple Tabulation Hashing

Mihai Patrascu; Mikkel Thorup

arXiv:1011.5200·cs.DS·May 10, 2011·1 cites

TL;DR

This paper demonstrates that simple tabulation hashing, despite its simplicity and low independence, provides strong probabilistic guarantees similar to more complex hash functions, making it practical for randomized algorithms.

Contribution

It proves that simple tabulation hashing, although not even 4-independent, yields Chernoff-type concentration bounds, min-wise hashing guarantees, and support for cuckoo hashing, challenging the assumption that these guarantees require higher independence.

Findings

01 Provides Chernoff-type concentration bounds

02 Achieves min-wise hashing guarantees

03 Supports cuckoo hashing with simple tabulation

Abstract

Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees. The scheme itself dates back to Carter and Wegman (STOC'77). Keys are viewed as consisting of c characters. We initialize c tables T_1, ..., T_c mapping characters to random hash codes. A key x=(x_1, ..., x_c) is hashed to T_1[x_1] xor ... xor T_c[x_c]. While this scheme is not even 4-independent, we show that it provides many of the guarantees that are normally obtained via higher independence, e.g., Chernoff-type concentration, min-wise hashing for estimating set intersection, and cuckoo hashing.
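
The scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name make_simple_tabulation, the choice of c = 4 characters of 8 bits each, the 32-bit hash codes, and the use of Python's random module to fill the tables are all assumptions made for the example.

```python
import random


def make_simple_tabulation(c=4, char_bits=8, out_bits=32, seed=None):
    """Return a hash function h(key) built by simple tabulation.

    Assumed parameters (not from the paper): keys are integers of
    c * char_bits bits, split into c characters of char_bits bits each,
    and hash codes are out_bits-bit integers.
    """
    rng = random.Random(seed)
    mask = (1 << char_bits) - 1
    # One table per character position, each mapping a character
    # to an independent random hash code.
    tables = [
        [rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
        for _ in range(c)
    ]

    def h(key):
        if not 0 <= key < (1 << (c * char_bits)):
            raise ValueError("key does not fit in c characters")
        code = 0
        for i in range(c):
            ch = (key >> (i * char_bits)) & mask  # i-th character of the key
            code ^= tables[i][ch]                 # xor in its table entry
        return code

    return h


h = make_simple_tabulation(seed=1)
# Four keys that differ only in their two lowest characters,
# taking every combination of {0, 1} in each of them.
keys = [0x0000, 0x0001, 0x0100, 0x0101]
xor_of_hashes = h(keys[0]) ^ h(keys[1]) ^ h(keys[2]) ^ h(keys[3])
```

For any choice of tables, xor_of_hashes is 0, because every table entry involved appears an even number of times. The hash of the fourth key is therefore determined by the other three, which is one way to see why the scheme is not 4-independent.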

Taxonomy

Topics: Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques · DNA and Biological Computing