Estimating Entropy of Distributions in Constant Space
Jayadev Acharya, Sourbh Bhadane, Piotr Indyk, Ziteng Sun

TL;DR
This paper introduces a streaming algorithm for estimating the entropy of k-ary distributions using constant space, achieving near-optimal sample complexity with minimal memory.
Contribution
The authors present a constant-space streaming algorithm for entropy estimation whose sample complexity is linear in the domain size k, whereas existing sample-optimal estimators need space nearly linear in k.
Findings
Requires only O(1) memory words of space.
Achieves a sample complexity of O(k log(1/ε)^2 / ε^3).
Provides accurate entropy estimates within ±ε.
Abstract
We consider the task of estimating the entropy of k-ary distributions from samples in the streaming model, where space is limited. Our main contribution is an algorithm that requires O(k log(1/ε)^2 / ε^3) samples and a constant O(1) memory words of space and outputs a ±ε estimate of H(p). Without space limitations, the sample complexity has been established as S(k, ε) = Θ(k / (ε log k) + (log k)^2 / ε^2), which is sub-linear in the domain size k, and the current algorithms that achieve optimal sample complexity also require nearly-linear space in k. Our algorithm partitions [0, 1] into intervals and estimates the entropy contribution of probability values in each interval. The intervals are designed to trade off the bias and variance of these estimates.
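To make the constant-space setting concrete, here is a minimal Python sketch of a basic streaming estimator built on the identity H(p) = E[-log p(X)] for X drawn from p. It is not the interval-based algorithm described in the abstract; it only illustrates how an entropy estimate can be formed while storing a handful of numbers. The names `estimate_entropy_stream` and `uniform_stream`, the parameters `rounds` and `window`, and the add-one smoothing are all assumptions made for this illustration.

```python
import math
import random
from typing import Iterator


def uniform_stream(k: int, seed: int) -> Iterator[int]:
    """Hypothetical test source: an endless stream of i.i.d. uniform symbols from {0, ..., k-1}."""
    rng = random.Random(seed)
    while True:
        yield rng.randrange(k)


def estimate_entropy_stream(stream: Iterator[int], rounds: int, window: int) -> float:
    """Estimate H(p) in nats from a sample stream.

    Only a constant number of values is kept at any time: the current
    symbol, a match counter, a running sum, and the loop counters.
    """
    total = 0.0
    for _ in range(rounds):
        x = next(stream)  # draw X ~ p
        count = 0
        for _ in range(window):  # count how often x reappears in the next `window` samples
            if next(stream) == x:
                count += 1
        # Add-one smoothing (an assumption of this sketch): x itself counts
        # as one occurrence, which also avoids log(0).
        p_hat = (count + 1) / (window + 1)
        total += -math.log(p_hat)
    return total / rounds  # average of -log p_hat(X) over all rounds


if __name__ == "__main__":
    est = estimate_entropy_stream(uniform_stream(8, seed=0), rounds=500, window=1000)
    print(f"estimate: {est:.3f} nats, true value: {math.log(8):.3f} nats")
```

The sketch consumes rounds × (window + 1) samples in total. Larger `window` values reduce the bias of each log-probability estimate and larger `rounds` values reduce the variance of the average, a simpler version of the bias-variance trade-off that the abstract attributes to the choice of intervals.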
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Data Stream Mining Techniques
