Nearly Optimal Distinct Elements and Heavy Hitters on Sliding Windows
Vladimir Braverman, Elena Grigorescu, Harry Lang, David P. Woodruff, Samson Zhou

TL;DR
This paper introduces a new composable histogram framework for efficiently approximating the number of distinct elements and heavy hitters in sliding window data streams, achieving near-optimal space complexity.
Contribution
The paper presents a novel composable histogram technique that improves space efficiency for sliding window algorithms for distinct elements and heavy hitters, with near-optimal bounds.
Findings
Space complexity for distinct elements is nearly optimal.
Improved space bounds for $\ell_p$-heavy hitters.
Established lower bounds matching the upper bounds.
Abstract
We study the distinct elements and $\ell_p$-heavy hitters problems in the sliding window model, where only the most recent $n$ elements in the data stream form the underlying set. We first introduce the composable histogram, a simple twist on the exponential (Datar et al., SODA 2002) and smooth histograms (Braverman and Ostrovsky, FOCS 2007) that may be of independent interest. We then show that the composable histogram along with a careful combination of existing techniques to track either the identity or frequency of a few specific items suffices to obtain algorithms for both distinct elements and $\ell_p$-heavy hitters that are nearly optimal in both $n$ and $\epsilon$. Applying our new composable histogram framework, we provide an algorithm that outputs a $(1+\epsilon)$-approximation to the number of distinct elements in the sliding window model and uses…
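To make the two problems concrete, here is a minimal exact baseline in Python. It is not the paper's composable histogram: it stores the entire window and so uses space linear in the window size, which is the cost the paper's algorithms avoid. The class name `ExactSlidingWindow`, its methods, and the parameters `n`, `eps`, and `p` are hypothetical names chosen for illustration. The heavy-hitter test assumes the standard threshold definition, under which an item is reported when its frequency is at least `eps` times the $\ell_p$ norm of the window's frequency vector.

```python
from collections import Counter, deque


class ExactSlidingWindow:
    """Exact statistics over the most recent n stream items.

    Illustrative baseline only: it keeps every item in the window,
    so its space is linear in n (unlike the sketches in the paper).
    """

    def __init__(self, n):
        if n <= 0:
            raise ValueError("window size n must be positive")
        self.n = n
        self.window = deque()    # items currently inside the window, oldest first
        self.counts = Counter()  # frequency of each item inside the window

    def update(self, item):
        """Append a new stream item and expire the oldest one if needed."""
        self.window.append(item)
        self.counts[item] += 1
        if len(self.window) > self.n:
            expired = self.window.popleft()
            self.counts[expired] -= 1
            if self.counts[expired] == 0:
                del self.counts[expired]

    def distinct(self):
        """Exact number of distinct items in the current window."""
        return len(self.counts)

    def heavy_hitters(self, eps, p=2):
        """Items whose frequency is at least eps * ||f||_p (assumed definition)."""
        norm = sum(c ** p for c in self.counts.values()) ** (1.0 / p)
        return {item for item, c in self.counts.items() if c >= eps * norm}


if __name__ == "__main__":
    sw = ExactSlidingWindow(n=4)
    for x in [1, 2, 1, 3, 3, 3]:
        sw.update(x)
    # The window now holds the last four items: 1, 3, 3, 3.
    print(sw.distinct())
    print(sw.heavy_hitters(eps=0.5))
```

An approximate algorithm for this model must answer the same two queries while storing far less than the window itself, which is why expiring old items (trivial here via `popleft`) becomes the central difficulty.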
