Optimal streaming and tracking distinct elements with high probability
Jarosław Błasiok

TL;DR
This paper improves the space complexity bounds for streaming algorithms estimating the number of distinct elements, removing unnecessary multiplicative factors and providing optimal solutions for both standard and strong tracking variants.
Contribution
It presents an optimal streaming algorithm for distinct elements that avoids space blow-up from multiple repetitions and establishes tight bounds for strong tracking.
Findings
Optimal space complexity for approximate distinct counting, with no multiplicative blow-up from parallel repetitions.
Space bounds for the strong tracking variant are proven to be optimal.
The results settle the space complexity for all standard parameters.
Abstract
The distinct elements problem is one of the fundamental problems in streaming algorithms: given a stream of integers in the range $[n]$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements in the input. After a long line of research an optimal solution for this problem with constant probability of success, using $\mathcal{O}(\frac{1}{\varepsilon^2} + \log n)$ bits of space, was given by Kane, Nelson and Woodruff in 2010. The standard approach used in order to achieve low failure probability $\delta$ is to take the median of $\log \delta^{-1}$ parallel repetitions of the original algorithm. We show that such a multiplicative space blow-up is unnecessary: we provide an optimal algorithm using $\mathcal{O}(\frac{\log \delta^{-1}}{\varepsilon^2} + \log n)$ bits of space, matching known lower bounds for this problem. That is, the…
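For readers unfamiliar with the "median of parallel repetitions" baseline mentioned in the abstract, the following is a minimal sketch of that standard trick. It is not the paper's algorithm, nor the Kane, Nelson and Woodruff algorithm: the base estimator here is a simple k-minimum-values (KMV) sketch chosen only for brevity, and the names `kmv_estimate`, `median_of_repetitions`, `k`, and `repetitions` are hypothetical. Each repetition keeps its own sketch, so space grows by a factor equal to the number of repetitions, which is the blow-up the paper shows to be avoidable.

```python
import hashlib
import heapq
import statistics


def _hash01(item: int, seed: int) -> float:
    """Map (seed, item) to a pseudo-random float in [0, 1)."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") / 2**64


def kmv_estimate(stream, k: int, seed: int) -> float:
    """Estimate the number of distinct items from the k smallest hash values.

    Illustrative stand-in for a constant-success-probability estimator.
    """
    heap = []     # max-heap (via negation) holding the k smallest hashes seen
    kept = set()  # hashes currently in the heap, to skip repeated items
    for item in stream:
        h = _hash01(item, seed)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # Replace the current largest kept hash with the smaller one.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct hashes: count is exact
    return (k - 1) / (-heap[0])  # classic KMV estimate from the k-th smallest


def median_of_repetitions(stream, k: int, repetitions: int) -> float:
    """Run independent copies (one seed each) and return the median estimate."""
    items = list(stream)  # materialised only so each copy can replay the input
    estimates = [kmv_estimate(items, k, seed) for seed in range(repetitions)]
    return statistics.median(estimates)


if __name__ == "__main__":
    data = [i % 5000 for i in range(20000)]  # 5000 distinct values, repeated
    print(median_of_repetitions(data, k=256, repetitions=9))
```

In a true one-pass setting the copies would be updated side by side rather than replaying a stored list; the list is used here only to keep the sketch short.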
