Bias-Aware Sketches
Jiecao Chen, Qin Zhang

TL;DR
This paper introduces bias-aware linear sketching algorithms that improve error guarantees for large-scale data processing, especially when data exhibits bias, validated through theoretical proofs and extensive experiments.
Contribution
It proposes the first bias-aware linear sketches, providing rigorous error guarantees and demonstrating practical superiority over existing methods.
Findings
Bias-aware sketches outperform traditional sketches on biased datasets.
Theoretical proofs establish improved error bounds for the new sketches.
Experimental results confirm the practical effectiveness of bias-aware sketches.
Abstract
Linear sketching algorithms have been widely used for processing large-scale distributed and streaming datasets. Their popularity is largely due to the fact that linear sketches can be naturally composed in the distributed model and be efficiently updated in the streaming model. The errors of linear sketches are typically expressed in terms of the sum of coordinates of the input vector excluding the largest ones, or, the mass on the tail of the vector. Thus, the precondition for these algorithms to perform well is that the mass on the tail is small, which is, however, not always the case: in many real-world datasets the coordinates of the input vector have a bias, which will generate a large mass on the tail. In this paper we propose linear sketches that are bias-aware. We rigorously prove that they achieve strictly better error guarantees than the corresponding…
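The two properties the abstract relies on, linearity and tail mass, can be illustrated with a small sketch. This is not the paper's algorithm. It assumes NumPy, a single-row Count-Sketch-style matrix built by the hypothetical helper `count_sketch_matrix`, a hypothetical `tail_mass` helper, and a bias value that is known in advance purely for illustration.

```python
import numpy as np

def tail_mass(x, k, p=1):
    """L_p norm of x after dropping its k largest-magnitude coordinates."""
    mags = np.sort(np.abs(x))
    tail = mags[:-k] if k > 0 else mags
    return float(np.sum(tail ** p) ** (1.0 / p))

def count_sketch_matrix(n, width, rng):
    """One Count-Sketch row written as an explicit (width x n) matrix.

    Each input coordinate is hashed to one bucket with a random sign.
    (Illustrative helper; a real implementation would use hash functions
    instead of storing the matrix.)
    """
    buckets = rng.integers(0, width, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    S = np.zeros((width, n))
    S[buckets, np.arange(n)] = signs
    return S

rng = np.random.default_rng(0)
n, k = 1000, 10
bias = 100.0  # assumed known here; the excerpt does not say how it is found

# A biased vector: every coordinate sits near `bias`, a few are heavier.
x = np.full(n, bias)
x[:k] += 50.0
x += rng.normal(0.0, 1.0, size=n)

# Linearity: sketches of two vectors add up to the sketch of their sum,
# which is what makes them composable across machines and stream updates.
y = rng.normal(0.0, 1.0, size=n)
S = count_sketch_matrix(n, 64, rng)
linear_ok = np.allclose(S @ (x + y), S @ x + S @ y)

# Tail mass with and without the bias removed.
raw_tail = tail_mass(x, k)
debiased_tail = tail_mass(x - bias, k)

print("linear:", linear_ok)
print("tail mass (raw):", raw_tail)
print("tail mass (bias removed):", debiased_tail)
```

Because the error of a standard linear sketch scales with the tail mass, the gap between the two printed tail values suggests why accounting for the bias can tighten the guarantee on data like this.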
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Image and Video Retrieval Techniques · Sparse and Compressive Sensing Techniques
