QSketch: An Efficient Sketch for Weighted Cardinality Estimation in Streams
Yiyan Qi, Rundong Li, Pinghui Wang, Yufang Sun, Rui Xing

TL;DR
QSketch is a memory-efficient, fast, and accurate method for estimating weighted cardinality in data streams, significantly reducing resource usage compared to existing techniques.
Contribution
We introduce QSketch, a novel weighted cardinality sketch that uses quantization and dynamic properties to improve accuracy and efficiency over prior methods.
Findings
QSketch is about 30% more accurate than state-of-the-art methods.
QSketch reduces memory usage to one-eighth of previous approaches.
QSketch achieves constant time complexity for updates.
Abstract
Estimating the cardinality of a data stream, i.e., its number of distinct elements, is a fundamental problem in areas like databases, computer networks, and information retrieval. This study considers the more general setting in which each element carries a positive weight. Unlike traditional cardinality estimation, weighted cardinality has received little research attention, and current methods require substantial memory and computation, which makes them a poor fit for resource-constrained devices and for real-time applications such as anomaly detection. To address these issues, we propose QSketch, a memory-efficient sketch method for estimating weighted cardinality in streams. QSketch uses a quantization technique to condense continuous variables into a compact set of integer variables, each requiring only 8 bits, making the sketch 8 times smaller than previous methods. Furthermore, we leverage dynamic…
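To make the quantization idea concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes that weighted cardinality means the sum of the weights of the distinct elements, that each element always arrives with the same weight, and that the continuous baseline keeps m registers holding the minimum of exponential variables with rate equal to the element weight. Each register is then quantized to floor(-log2(r)) and stored in one byte. The names QuantizedWeightedSketch and _uniform, the BLAKE2b hashing, and the bisection-based maximum-likelihood estimator are all illustrative choices, not taken from the paper. This naive version touches every register on each update, so it does not reproduce the constant-time updates reported above.

```python
import hashlib
import math

EMPTY = 0     # register value meaning "nothing seen yet"
OFFSET = 128  # stored byte = clamped exponent + OFFSET, so it lands in 1..255


def exact_weighted_cardinality(stream):
    """Baseline: sum each distinct element's weight once (memory grows with the stream)."""
    weights = {}
    for element, weight in stream:
        weights[element] = weight
    return sum(weights.values())


def _uniform(element, j):
    """Deterministic pseudo-uniform value strictly inside (0, 1) for (element, register j)."""
    digest = hashlib.blake2b(f"{j}:{element}".encode(), digest_size=8).digest()
    return ((int.from_bytes(digest, "big") >> 12) + 0.5) / 2**52


class QuantizedWeightedSketch:
    """Hypothetical illustration: m one-byte registers holding quantized exponents."""

    def __init__(self, m=256):
        self.registers = bytearray(m)  # m bytes in total

    def update(self, element, weight):
        # Naive O(m) update: one exponential variable per register.
        for j in range(len(self.registers)):
            r = -math.log(_uniform(element, j)) / weight  # Exp(weight) sample
            y = math.floor(-math.log2(r))                 # quantize to an integer
            y = max(-127, min(127, y))                    # clamp so it fits in a byte
            if y + OFFSET > self.registers[j]:            # max of y == min of r
                self.registers[j] = y + OFFSET

    def estimate(self):
        used = [b for b in self.registers if b != EMPTY]
        if not used:
            return 0.0
        # Register value k means 2^-(k+1) < r <= 2^-k; let a = 2^-(k+1).
        a = [2.0 ** -(b - OFFSET + 1) for b in used]

        def score(c):
            # Derivative of the log-likelihood in c; it is decreasing in c.
            total = 0.0
            for a_j in a:
                x = a_j * c
                total += a_j * ((1.0 / math.expm1(x) if x < 700 else 0.0) - 1.0)
            return total

        lo = hi = 1.0
        while score(hi) > 0:   # grow until the root is bracketed from above
            hi *= 2
        while score(lo) < 0:   # shrink until the root is bracketed from below
            lo /= 2
        for _ in range(100):   # bisection on a log scale
            mid = math.sqrt(lo * hi)
            if score(mid) > 0:
                lo = mid
            else:
                hi = mid
        return math.sqrt(lo * hi)


# Small demo: 200 distinct elements, each seen more than once.
demo_stream = [(f"flow-{i % 200}", 1.0 + (i % 200) % 5) for i in range(300)]
demo_sketch = QuantizedWeightedSketch(m=256)
for element, weight in demo_stream:
    demo_sketch.update(element, weight)
print("exact:", exact_weighted_cardinality(demo_stream))
print("estimate:", demo_sketch.estimate())
```

Because the registers keep a maximum and the hash is deterministic, repeated arrivals of the same element leave the sketch unchanged. The one-byte registers would give the eightfold saving mentioned in the abstract if the continuous baseline stores each register as a 64-bit float, which is an assumption here.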
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Clustering Algorithms Research · Advanced Database Systems and Queries
