Query complexity of heavy hitter estimation
Sahasrajit Sarmasarkar, Kota Srinivas Reddy, and Nikhil Karamchandani

TL;DR
This paper studies the query complexity of identifying the heavy hitters of a distribution through active queries to an oracle, proposing algorithms and bounds for two query models, including settings with noisy responses.
Contribution
It introduces new sequential algorithms and bounds for heavy hitter estimation under two query models, including robustness to noise.
Findings
Upper bounds on query complexity for both models.
Lower bounds establishing optimality of algorithms.
Robust estimators effective under noisy responses.
Abstract
We consider the problem of identifying the subset $\mathcal{S}^{\gamma}_{\mathcal{P}}$ of elements in the support of an underlying distribution $\mathcal{P}$ whose probability value is larger than a given threshold $\gamma$, by actively querying an oracle to gain information about a sequence $X_1, X_2, \ldots$ of samples drawn from $\mathcal{P}$. We consider two query models: (a) each query is an index $i$ and the oracle returns the value $X_i$, and (b) each query is a pair $(i, j)$ and the oracle gives a binary answer confirming whether $X_i = X_j$ or not. For each of these query models, we design sequential estimation algorithms which, at each round, either decide what query to send to the oracle depending on the entire history of responses, or decide to stop and output an estimate of $\mathcal{S}^{\gamma}_{\mathcal{P}}$, which is required to be correct with some pre-specified…
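To make the two query models concrete, here is a minimal Python sketch. It is not the paper's sequential algorithm: it uses a fixed, non-adaptive budget of `n` samples and a plain empirical-frequency threshold, with no stopping rule or confidence guarantee. The class `Oracle`, the functions `heavy_hitters_by_value` and `heavy_bins_by_pair`, and the assumption that samples are drawn i.i.d. are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


class Oracle:
    """Hypothetical oracle holding samples drawn from a distribution.

    Assumption: samples are drawn i.i.d., lazily, and cached by index.
    """

    def __init__(self, probs, seed=0):
        self._rng = random.Random(seed)
        self._support = list(range(len(probs)))
        self._probs = probs
        self._samples = {}

    def _sample(self, i):
        # Draw the i-th sample on first access, then reuse it.
        if i not in self._samples:
            self._samples[i] = self._rng.choices(
                self._support, weights=self._probs
            )[0]
        return self._samples[i]

    def query_value(self, i):
        """Query model (a): return the value of the sample at index i."""
        return self._sample(i)

    def query_pair(self, i, j):
        """Query model (b): report whether samples i and j are equal."""
        return self._sample(i) == self._sample(j)


def heavy_hitters_by_value(oracle, n, gamma):
    """Naive estimate under model (a): threshold empirical frequencies."""
    counts = Counter(oracle.query_value(i) for i in range(n))
    return {x for x, c in counts.items() if c / n > gamma}


def heavy_bins_by_pair(oracle, n, gamma):
    """Naive estimate under model (b).

    Values are never revealed, so samples are grouped into bins of equal
    samples by comparing each new index with one representative per bin.
    Returns the bins whose empirical frequency exceeds gamma, plus the
    number of pairwise queries spent.
    """
    bins = []
    queries = 0
    for i in range(n):
        for b in bins:
            queries += 1
            if oracle.query_pair(b[0], i):
                b.append(i)
                break
        else:
            # No existing bin matched: index i starts a new bin.
            bins.append([i])
    heavy = [b for b in bins if len(b) / n > gamma]
    return heavy, queries
```

Under model (b) the estimator can only return groups of sample indices, not element labels, and it pays up to one query per existing bin for every new sample. This gives a rough sense of why the two models can have different query costs.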
