Strongly universal string hashing is fast

Owen Kaser; Daniel Lemire

arXiv:1202.4961·cs.DB·September 24, 2018

TL;DR

This paper introduces fast strongly universal string hashing families that are often faster than popular hash functions with weaker theoretical guarantees. The claims are supported by experimental results on several processors and by proofs of strong universality.

Contribution

The paper presents strongly universal hash families, benchmarks them against popular hash functions with weaker guarantees, and gives accessible proofs of their strong universality.

Findings

01

The proposed hash families can process data at a rate of 0.2 CPU cycles per byte.

02

Although they require a large buffer of random numbers, these families are often faster than popular hash functions with weaker theoretical guarantees.

03

Experiments cover several processors, including low-powered ones, and include hash functions designed for the Carry-Less Multiplication (CLMUL) instruction set.

Abstract

We present fast strongly universal string hashing families: they can process data at a rate of 0.2 CPU cycle per byte. Maybe surprisingly, we find that these families---though they require a large buffer of random numbers---are often faster than popular hash functions with weaker theoretical guarantees. Moreover, conventional wisdom is that hash functions with fewer multiplications are faster. Yet we find that they may fail to be faster due to operation pipelining. We present experimental results on several processors including low-powered processors. Our tests include hash functions designed for processors with the Carry-Less Multiplication (CLMUL) instruction set. We also prove, using accessible proofs, the strong universality of our families.
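
To make the "large buffer of random numbers" concrete, here is a minimal sketch of a multilinear-style string hash of the kind the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the names `make_keys` and `multilinear_hash` are hypothetical, characters are assumed to be 32-bit unsigned integers, keys are assumed to be 64-bit random integers (one more key than there are characters), and the strings are assumed to have a fixed length. Python's `random` module stands in for a proper source of randomness.

```python
import random

MASK64 = (1 << 64) - 1  # arithmetic is done modulo 2**64


def make_keys(n_chars, seed=None):
    """Return n_chars + 1 random 64-bit keys (hypothetical helper).

    Assumption: random.Random is used only for illustration; a real
    deployment would draw keys from a stronger random source.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(64) for _ in range(n_chars + 1)]


def multilinear_hash(chars, keys):
    """Hash a sequence of 32-bit characters to a 32-bit value.

    Computes (keys[0] + sum(keys[i + 1] * chars[i])) mod 2**64 and
    keeps the 32 most significant bits of the 64-bit result.
    """
    if len(keys) < len(chars) + 1:
        raise ValueError("need one more key than there are characters")
    acc = keys[0]
    for key, ch in zip(keys[1:], chars):
        if not 0 <= ch < (1 << 32):
            raise ValueError("characters must fit in 32 bits")
        acc = (acc + key * ch) & MASK64  # one multiplication per character
    return acc >> 32  # discard the low 32 bits


# Usage: the same keys must be reused for every string being hashed.
keys = make_keys(4, seed=1)
digest = multilinear_hash([10, 20, 30, 40], keys)
```

The sketch uses one multiplication per character and one random key per character, which is why the key buffer must be at least as long as the longest string. Variants that trade multiplications for additions exist, but as the abstract notes, fewer multiplications do not necessarily translate into higher speed once pipelining is taken into account.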
