On the Sample Complexity of Compressed Counting
Ping Li

TL;DR
This paper introduces a simple Compressed Counting algorithm, based on the sample minimum estimator, for estimating the p-th frequency moments of data streams, and proves a much improved sample complexity bound compared to prior results. The case p → 1 is particularly useful for estimating the Shannon entropy of a stream.
Contribution
It presents a new simple algorithm using the sample minimum estimator and proves improved sample complexity bounds for compressed counting.
Findings
A much improved sample complexity bound for estimating p-th frequency moments, compared to prior results.
The case p → 1 is useful for estimating the Shannon entropy of data streams.
A very simple algorithm based on the sample minimum estimator.
Abstract
Compressed Counting (CC), based on maximally skewed stable random projections, was recently proposed for estimating the p-th frequency moments of data streams. The case p → 1 is extremely useful for estimating the Shannon entropy of data streams. In this study, we provide a very simple algorithm based on the sample minimum estimator and prove a much improved sample complexity bound, compared to prior results.
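To illustrate why the case p → 1 matters for entropy, the following minimal sketch computes the p-th frequency moment exactly on a small vector of counts and shows that the Tsallis entropy, which depends only on the p-th and first frequency moments, approaches the Shannon entropy as p approaches 1. This is not the Compressed Counting algorithm or the sample minimum estimator from the paper: it uses exact moments rather than estimates from random projections, and the function names and the example counts are illustrative assumptions.

```python
import math


def frequency_moment(counts, p):
    """Exact p-th frequency moment F_p = sum_i a_i^p over the nonzero counts."""
    return sum(a ** p for a in counts if a > 0)


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution a_i / F_1."""
    total = sum(counts)
    return -sum((a / total) * math.log(a / total) for a in counts if a > 0)


def tsallis_entropy(counts, p):
    """Tsallis entropy T_p = (1 - F_p / F_1^p) / (p - 1), defined for p != 1.

    As p -> 1 this converges to the Shannon entropy, which is why an
    accurate estimate of F_p near p = 1 yields an entropy estimate.
    """
    f1 = frequency_moment(counts, 1)
    fp = frequency_moment(counts, p)
    return (1.0 - fp / f1 ** p) / (p - 1.0)


# Hypothetical item counts accumulated from a stream (illustrative only).
counts = [5, 3, 1, 1]
h = shannon_entropy(counts)
t = tsallis_entropy(counts, 1.001)
print(f"Shannon entropy: {h:.4f}, Tsallis entropy at p=1.001: {t:.4f}")
```

Because the Tsallis formula divides by p - 1, any error in the moment estimate is amplified when p is close to 1, so entropy estimation by this route needs very accurate moment estimates. That is the setting in which a sample complexity bound for the moment estimator becomes important.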
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Algorithms and Data Compression
