AMS Without 4-Wise Independence on Product Domains
Vladimir Braverman, Kai-Min Chung, Zhenming Liu, Michael Mitzenmacher, Rafail Ostrovsky

TL;DR
This paper extends sketching techniques based on 4-wise independence to product domains, enabling one-pass approximation of how far a joint distribution is from the product of its marginals, and hence independence testing for high-dimensional data streams.
Contribution
It generalizes existing methods from pairs to k-ary vectors, providing a space-efficient randomized algorithm for independence approximation in data streams.
Findings
Achieves a $(1 \pm \varepsilon)$ approximation with high probability.
Uses space logarithmic in the domain sizes and exponential in the dimension $k$.
Extends previous work from 2-dimensional to k-dimensional data.
Abstract
In their seminal work, Alon, Matias, and Szegedy introduced several sketching techniques, including showing that 4-wise independence is sufficient to obtain good approximations of the second frequency moment. In this work, we show that their sketching technique can be extended to product domains $[n]^k$ by using the product of 4-wise independent functions on $[n]$. Our work extends that of Indyk and McGregor, who showed the result for $k = 2$. Their primary motivation was the problem of identifying correlations in data streams. In their model, a stream of pairs $(i, j) \in [n]^2$ arrives, giving a joint distribution $(X, Y)$, and they find approximation algorithms for how close the joint distribution is to the product of the marginal distributions under various metrics, which naturally corresponds to how close $X$ and $Y$ are to being independent. By using our technique, we obtain a new…
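The estimator the abstract describes can be illustrated with a small Python sketch. It is not the authors' code: the hash construction (a random cubic polynomial over a Mersenne prime field, reduced to a sign by its low bit), the function names, the number of copies, and the plain averaging of squared sketches are all assumptions made for illustration. For each coordinate it draws one (approximately unbiased) 4-wise independent sign function, multiplies the signs across the $k$ coordinates of each stream item, and compares the average of that product with the product of the per-coordinate averages. The squared difference has expectation equal to the squared $\ell_2$ distance between the empirical joint distribution and the product of its marginals.

```python
import random

P = (1 << 61) - 1  # Mersenne prime field size (an illustrative choice)

def make_sign_hash(rng):
    """Hypothetical helper: random cubic polynomial over Z_P mapped to +/-1.

    The polynomial values are 4-wise independent on inputs below P; taking the
    low bit adds a bias of about 1/P, which is ignored in this sketch.
    """
    coeffs = [rng.randrange(P) for _ in range(4)]

    def h(x):
        acc = 0
        for c in coeffs:  # Horner evaluation modulo P
            acc = (acc * x + c) % P
        return 1 if acc & 1 else -1

    return h

def sketch_once(stream, k, rng):
    """One copy of the sketch for a non-empty stream of k-tuples."""
    hashes = [make_sign_hash(rng) for _ in range(k)]
    joint_sum = 0          # running sum of the product of signs
    marginal_sums = [0] * k  # running sum of each coordinate's sign
    m = 0
    for item in stream:
        prod = 1
        for l in range(k):
            s = hashes[l](item[l])
            prod *= s
            marginal_sums[l] += s
        joint_sum += prod
        m += 1
    marginal_prod = 1.0
    for l in range(k):
        marginal_prod *= marginal_sums[l] / m
    return joint_sum / m - marginal_prod

def estimate_l2_sq(stream, k, copies=2000, seed=0):
    """Average of squared sketches over independent copies.

    The stream is materialized here for simplicity; a true one-pass version
    would update all copies in parallel as each item arrives.
    """
    rng = random.Random(seed)
    items = list(stream)
    return sum(sketch_once(items, k, rng) ** 2 for _ in range(copies)) / copies
```

As a sanity check, for the perfectly correlated stream `[(i, i) for i in range(8)]` the exact squared distance is 448/4096 = 0.109375, so `estimate_l2_sq` with `k=2` should land near that value. A stream that enumerates a full product set, where the coordinates are exactly independent, gives an estimate of zero. A median-of-means combination of the copies is the usual way to boost the success probability; plain averaging is used here only to keep the example short.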
Taxonomy
Topics: Distributed systems and fault tolerance · Advanced Database Systems and Queries · Complexity and Algorithms in Graphs
