Approximating Large Frequency Moments with Pick-and-Drop Sampling
Vladimir Braverman, Rafail Ostrovsky

TL;DR
This paper introduces a novel sampling technique called pick-and-drop for approximating large frequency moments in data streams, achieving improved space complexity bounds for k ≥ 3.
Contribution
It presents a new non-uniform matrix sampling method that reduces space complexity for frequency moment approximation in insertion-only streams.
Findings
Achieves an $O(n^{1-2/k} \log(n) \log^{(c)}(n))$ space complexity bound.
Introduces pick-and-drop sampling that efficiently identifies heavy elements.
Reduces the space needed for heavy element estimation to $O(n^{1-2/k} \log(n))$ bits.
Abstract
Given a data stream $D = \{p_1, p_2, \dots, p_m\}$ of size $m$ of numbers from $\{1, \dots, n\}$, the frequency of $i$ is defined as $f_i = |\{j : p_j = i\}|$. The $k$-th frequency moment of $D$ is defined as $F_k = \sum_{i=1}^{n} f_i^k$. We consider the problem of approximating frequency moments in insertion-only streams for $k \ge 3$. For any constant $c$ we show an $O(n^{1-2/k} \log(n) \log^{(c)}(n))$ upper bound on the space complexity of the problem. Here $\log^{(c)}(n)$ is the iterative $\log$ function. To simplify the presentation, we make the following assumptions: $n$ and $m$ are polynomially far; approximation error $\epsilon$ and parameter $k$ are constants. We observe a natural bijection between streams and special matrices. Our main technical contribution is a non-uniform sampling method on matrices. We call our method pick-and-drop sampling; it samples a heavy element…
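For reference, the quantities named in the abstract can be computed exactly on a small stream. The sketch below is an illustration only, not the paper's algorithm: it stores every count, which is precisely the linear-space baseline that streaming methods aim to avoid. It assumes the standard definition $F_k = \sum_i f_i^k$ and takes the iterative logarithm to be base 2 applied $c$ times; the function names `frequency_moment` and `iterated_log` and the choice of base are assumptions made for this example.

```python
from collections import Counter
import math


def frequency_moment(stream, k):
    """Exact k-th frequency moment: sum of (frequency ** k) over distinct items.

    Uses memory proportional to the number of distinct items, so it is a
    baseline for checking results, not a small-space streaming algorithm.
    """
    counts = Counter(stream)
    return sum(f ** k for f in counts.values())


def iterated_log(n, c):
    """Apply log base 2 to n, c times (base 2 is an assumption here)."""
    value = float(n)
    for _ in range(c):
        value = math.log2(value)
    return value


# Frequencies are f_1 = 3, f_2 = 2, f_3 = 1, so F_3 = 27 + 8 + 1 = 36.
example_stream = [1, 2, 1, 3, 1, 2]
print(frequency_moment(example_stream, 3))
```

Note that $F_1$ is simply the stream length $m$, which gives a quick sanity check for any implementation.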
Taxonomy
Topics: Advanced Database Systems and Queries · Machine Learning and Algorithms · Data Management and Algorithms
