Improving \textit{Tug-of-War} sketch using Control-Variates method
Rameshwar Pratap, Bhisham Dev Verma, Raghav Kulkarni

TL;DR
This paper enhances the Tug-of-War sketch for data streams by applying Control-Variates techniques to significantly reduce variance in estimates, improving accuracy with minimal additional computation.
Contribution
It introduces a novel variance reduction method for the Tug-of-War sketch using Control-Variates, supported by theoretical analysis and experiments.
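As background for the contribution above, the generic control-variates idea can be sketched as follows. This is a textbook illustration, not the paper's estimator: the target `exp(U)`, the control `U`, and the function name `control_variate_estimates` are assumptions chosen only because the control's mean (0.5) is known exactly.

```python
import math
import random
import statistics


def control_variate_estimates(n_samples, seed=0):
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    Hypothetical illustration: the control variate is U itself, whose mean
    (0.5) is known exactly. The true value of E[exp(U)] is e - 1.
    """
    rng = random.Random(seed)
    us = [rng.random() for _ in range(n_samples)]
    xs = [math.exp(u) for u in us]  # samples of the quantity of interest
    ys = us                         # samples of the control variate

    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)

    # Coefficient c = Cov(X, Y) / Var(Y), estimated from the samples.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n_samples - 1)
    var_y = statistics.variance(ys)
    c = cov_xy / var_y

    # Adjusted samples: subtract c times the control's deviation from its known mean.
    adjusted = [x - c * (y - 0.5) for x, y in zip(xs, ys)]

    return {
        "plain_mean": mean_x,
        "cv_mean": statistics.fmean(adjusted),
        "plain_var": statistics.variance(xs),
        "cv_var": statistics.variance(adjusted),
        "c": c,
    }


if __name__ == "__main__":
    result = control_variate_estimates(5000)
    print(result)
```

The adjusted samples keep the same expectation as the plain ones, but their variance drops by a factor that depends on how strongly the control correlates with the target. How the paper chooses its control variate for the Tug-of-War sketch is not shown here.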
Findings
Significant variance reduction achieved in frequency moment estimates.
Improved accuracy with minimal computational overhead.
Validated effectiveness on synthetic and real datasets.
Abstract
Computing space-efficient summaries, \textit{a.k.a. sketches}, of large data is a central problem in streaming algorithms. Such sketches are used to answer \textit{post-hoc} queries in several data analytics tasks. An algorithm for computing sketches is typically required to be fast, accurate, and space-efficient. A fundamental problem in the streaming framework is that of computing the frequency moments of data streams. The frequency moments of a sequence containing $f_i$ elements of type $i$, where $i \in [n]$, are the numbers $F_k = \sum_{i=1}^{n} f_i^k$. This is also called the $\ell_k$ norm of the frequency vector $\langle f_1, f_2, \ldots, f_n \rangle$ (more precisely, it is the $k$-th power of that norm). Another important problem is to compute the similarity between two data streams by computing the inner product of the corresponding frequency vectors. The seminal work of Alon, Matias, and Szegedy~\cite{AMS}, \textit{a.k.a.…
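The second frequency moment and the inner product mentioned in the abstract can be estimated with a basic Tug-of-War (AMS) sketch. The following is a minimal sketch under stated assumptions, not the paper's improved estimator: the function names are hypothetical, the item universe is assumed to be known in advance, and fully independent random signs stand in for the 4-wise independent hash functions used in the original analysis.

```python
import random
from collections import Counter


def make_signs(universe, seed):
    """Assign every item in the universe a random sign in {-1, +1}.

    Simplifying assumption: signs are fully independent and stored in a
    dict, instead of being computed by a 4-wise independent hash function.
    """
    rng = random.Random(seed)
    return {item: rng.choice((-1, 1)) for item in universe}


def tug_of_war(stream, signs_list):
    """Single pass over the stream: counter j adds sign_j(item) per arrival.

    Every item in the stream must belong to the universe used in make_signs.
    """
    counters = [0] * len(signs_list)
    for item in stream:
        for j, signs in enumerate(signs_list):
            counters[j] += signs[item]
    return counters


def estimate_f2(counters):
    """Average of squared counters; each square is an unbiased estimate of F_2."""
    return sum(c * c for c in counters) / len(counters)


def estimate_inner_product(counters_a, counters_b):
    """Average of counter products; both sketches must share the same signs."""
    return sum(a * b for a, b in zip(counters_a, counters_b)) / len(counters_a)


def exact_f2(stream):
    """Exact F_2 = sum of squared frequencies, for comparison only."""
    return sum(f * f for f in Counter(stream).values())


if __name__ == "__main__":
    universe = list(range(20))
    data_rng = random.Random(123)
    stream = [data_rng.choice(universe) for _ in range(300)]
    signs_list = [make_signs(universe, seed) for seed in range(400)]
    sketch = tug_of_war(stream, signs_list)
    print("estimate:", estimate_f2(sketch), "exact:", exact_f2(stream))
```

Averaging over more counters lowers the variance of the estimate at the cost of more space. That variance is the quantity a control-variates correction aims to reduce.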
Taxonomy
Topics: Data Management and Algorithms · Data Stream Mining Techniques · Advanced Database Systems and Queries
