Tight Bounds for Hashing Block Sources
Kai-Min Chung, Salil Vadhan

TL;DR
This paper tightens the bounds on how much min-entropy per item a block source needs for a 2-universal hash function to produce nearly uniform outputs. The dependence on the number of items T drops from 2 log T to log T, which in turn improves the analysis of hashing-based algorithms and data structures.
Contribution
The paper gives tight bounds on the min-entropy per item required when hashing block sources, reducing the dependence on the number of items T from 2 log T to log T, and shows that log T is optimal.
Findings
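Reduced the dependence of the required min-entropy per item on the number of items T from 2 log T to log T.
Proved that the new log T dependence is optimal.
Obtained corresponding improvements to the Mitzenmacher and Vadhan (SODA '08) analysis of hashing-based algorithms and data structures.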
Abstract
It is known that if a 2-universal hash function is applied to elements of a block source, where each item has enough min-entropy conditioned on the previous items, then the output distribution will be "close" to the uniform distribution. We provide improved bounds on how much min-entropy per item is required for this to hold, both when we ask that the output be close to uniform in statistical distance and when we only ask that it be statistically close to a distribution with small collision probability. In both cases, we reduce the dependence of the min-entropy on the number T of items from 2 log T in previous work to log T, which we show to be optimal. This leads to corresponding improvements to the recent results of Mitzenmacher and Vadhan (SODA '08) on the analysis of hashing-based algorithms and data structures when…
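To make the abstract's terms concrete, here is a minimal Python sketch of three notions it relies on: a 2-universal hash family, collision probability, and statistical distance from uniform. It is not the paper's analysis. The family used is the standard Carter-Wegman construction h(x) = ((a·x + b) mod p) mod m, chosen here as an assumption for illustration, and all function names and the small parameters p = 101, m = 10 are hypothetical.

```python
from collections import Counter
from itertools import product


def make_hash(a, b, p, m):
    """Carter-Wegman hash h_{a,b}(x) = ((a*x + b) mod p) mod m (illustrative choice)."""
    return lambda x: ((a * x + b) % p) % m


def pair_collision_rate(x, y, p, m):
    """Exact Pr over (a, b) that h(x) == h(y), by enumerating the whole family.

    Assumes p is prime and x, y are distinct keys in [0, p).
    """
    hits = 0
    total = 0
    for a, b in product(range(1, p), range(p)):
        h = make_hash(a, b, p, m)
        hits += h(x) == h(y)
        total += 1
    return hits / total


def collision_probability(samples):
    """Empirical collision probability: sum over values v of Pr[v]^2."""
    counts = Counter(samples)
    n = len(samples)
    return sum((c / n) ** 2 for c in counts.values())


def distance_from_uniform(samples, m):
    """Empirical statistical distance between the samples and uniform on {0..m-1}."""
    counts = Counter(samples)
    n = len(samples)
    return 0.5 * sum(abs(counts.get(v, 0) / n - 1 / m) for v in range(m))


if __name__ == "__main__":
    # 2-universality: two distinct keys collide with probability at most 1/m.
    print(pair_collision_rate(3, 7, p=101, m=10))
```

The enumeration in `pair_collision_rate` only checks the pairwise 2-universality property on a toy family. The paper's question is the harder one of how much conditional min-entropy each of the T items needs so that the joint output is close to uniform, or close to a distribution with small collision probability.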
Taxonomy
Topics: Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques · Complexity and Algorithms in Graphs
