Hidden Sketch: A Space-Efficient Reversible Sketch for Tracking Frequent Items in Data Streams
Zicang Xu, Yuxuan Tian, Yuhan Wu, Tong Yang

TL;DR
Hidden Sketch is a space-efficient reversible data structure that tracks frequent items in data streams by combining a Reversible Bloom Filter (RBF) with a Count-Min (CM) sketch, so that both keys and their frequencies can be reconstructed with little memory.
Contribution
The paper introduces Hidden Sketch, a reversible sketch designed to achieve high accuracy and space efficiency at the same time, rather than trading one for the other as existing sketches do.
Findings
Outperforms existing sketches in accuracy and space usage
Provides theoretical guarantees on space complexity and reversibility
Demonstrates scalability in real-time data stream analytics
Abstract
Modern data stream applications demand memory-efficient solutions for accurately tracking frequent items, such as heavy hitters and heavy changers, under strict resource constraints. Traditional sketches face inherent accuracy-memory trade-offs: they either lose precision to reduce memory usage or inflate memory costs to enable high recording capacity. This paper introduces Hidden Sketch, a space-efficient reversible data structure for key and frequency encoding. Our design uniquely combines a Reversible Bloom Filter (RBF) and a Count-Min (CM) Sketch for invertible key and frequency storage, enabling precise reconstruction for both keys and their frequencies with minimal memory. Theoretical analysis establishes Hidden Sketch's space complexity and guaranteed reversibility, while extensive experiments demonstrate its substantial improvements in accuracy and space efficiency in frequent…
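For readers unfamiliar with the Count-Min component named in the abstract, the following is a minimal sketch of a standard Count-Min sketch in Python. It is not the authors' Hidden Sketch and does not include the Reversible Bloom Filter; the class name `CountMinSketch`, the `width` and `depth` defaults, and the choice of salted BLAKE2b as the hash family are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Standard Count-Min sketch (illustrative; not the paper's Hidden Sketch)."""

    def __init__(self, width=1024, depth=4):
        # width and depth are hypothetical defaults, not values from the paper.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key, count=1):
        # Add the count to one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def query(self, key):
        # The minimum over rows never underestimates the true count
        # (for non-negative updates); collisions can only inflate it.
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )


if __name__ == "__main__":
    cms = CountMinSketch()
    for _ in range(5):
        cms.update("flow-a")
    cms.update("flow-b", 2)
    print(cms.query("flow-a"), cms.query("flow-b"))
```

A plain Count-Min sketch answers frequency queries only for keys the caller already knows; it cannot list the keys it has seen. That limitation is what a reversible design such as the one described in the abstract is meant to address.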
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Database Systems and Queries · Time Series Analysis and Forecasting
