Heavy hitters via cluster-preserving clustering
Kasper Green Larsen, Jelani Nelson, Huy L. Nguyen, Mikkel Thorup

TL;DR
This paper introduces ExpanderSketch, an optimal space and time algorithm for turnstile $\ell_p$ heavy hitters, using a novel cluster-preserving clustering approach to efficiently identify heavy hitters in high-dimensional data.
Contribution
It presents a new algorithm that achieves optimal space and update time for heavy hitters, improving query efficiency by reducing the problem to cluster-preserving clustering.
Findings
Achieves $O(\varepsilon^{-p}\log n)$ space and $O(\log n)$ update time.
Provides a fast query time of $O(\varepsilon^{-p}\,\mathrm{poly}(\log n))$ with high probability.
Introduces a spectral clustering reduction for heavy hitters detection.
Abstract
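In turnstile $\ell_p$ $\varepsilon$-heavy hitters, one maintains a high-dimensional $x\in\mathbb{R}^n$ subject to $\texttt{update}(i,\Delta)$ causing $x_i\leftarrow x_i+\Delta$, where $i\in[n]$, $\Delta\in\mathbb{R}$. Upon receiving a query, the goal is to report a small list $L\subset[n]$, $|L|=O(1/\varepsilon^p)$, containing every "heavy hitter" $i\in[n]$ with $|x_i|\ge\varepsilon\|x_{\overline{1/\varepsilon^p}}\|_p$, where $x_{\overline{k}}$ denotes the vector obtained by zeroing out the largest $k$ entries of $x$ in magnitude. For any $p\in(0,2]$ the CountSketch solves $\ell_p$ heavy hitters using $O(\varepsilon^{-p}\log n)$ words of space with $O(\log n)$ update time, $O(n\log n)$ query time to output $L$, and whose output after any query is correct with high probability (whp) $1-1/\mathrm{poly}(n)$. Unfortunately the query time is very slow. To remedy this, the work [CM05]…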
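To make the baseline concrete, here is a minimal Python sketch of a CountSketch-style structure with turnstile updates and the slow query that estimates every coordinate. It is an illustration under stated assumptions, not the paper's ExpanderSketch: the class name, the parameters `rows` and `width`, and the fully random hash tables (which themselves take space linear in $n$, unlike the limited-independence hash functions a real implementation would use) are all hypothetical choices made for readability.

```python
import random
from statistics import median


class CountSketch:
    """Toy CountSketch: `rows` independent rows of `width` signed counters."""

    def __init__(self, n, rows, width, seed=0):
        rng = random.Random(seed)
        self.n = n
        # Assumption: fully random bucket and sign tables stand in for
        # limited-independence hash functions, purely for simplicity.
        self.bucket = [[rng.randrange(width) for _ in range(n)] for _ in range(rows)]
        self.sign = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(rows)]
        self.table = [[0.0] * width for _ in range(rows)]

    def update(self, i, delta):
        """Turnstile update x_i <- x_i + delta; delta may be negative."""
        for r, row in enumerate(self.table):
            row[self.bucket[r][i]] += self.sign[r][i] * delta

    def estimate(self, i):
        """Median over rows of the signed counter that coordinate i hashes to."""
        return median(self.sign[r][i] * row[self.bucket[r][i]]
                      for r, row in enumerate(self.table))

    def heavy_hitters(self, k):
        """Slow query: estimate all n coordinates, keep the k largest in magnitude."""
        est = [(abs(self.estimate(i)), i) for i in range(self.n)]
        return sorted(i for _, i in sorted(est, reverse=True)[:k])


# Usage: unit-magnitude noise on every coordinate plus two heavy coordinates.
sketch = CountSketch(n=1000, rows=9, width=256, seed=1)
noise = random.Random(2)
for i in range(1000):
    sketch.update(i, noise.choice((-1, 1)))
sketch.update(17, 500)
sketch.update(17, -100)   # deletions are allowed in the turnstile model
sketch.update(423, -300)
result = sketch.heavy_hitters(2)
print(result)
```

Each update touches one counter per row, while `heavy_hitters` loops over all $n$ coordinates; that loop is the slow query time the abstract refers to.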
