Minimum Rates of Approximate Sufficient Statistics
Masahito Hayashi, Vincent Y. F. Tan

TL;DR
This paper introduces approximate sufficient statistics and shows that tolerating a small error in estimating the generating distribution reduces the code length needed to store a statistic for a parametric family to (d/2) log n + O(1).
Contribution
It develops a Shannon-theoretic framework for approximate sufficient statistics, providing bounds on code length reductions and establishing strong converses for various error measures.
Findings
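With approximate sufficient statistics, the code length can be reduced to (d/2) log n + O(1), where n is the number of samples and d is the number of degrees of freedom.
Errors are measured using relative entropy and variational distance, and bounds are established under both.
Strong converses hold: the lower bound on code length persists even when the error is allowed to be non-vanishing.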
Abstract
Given a sufficient statistic for a parametric family of distributions, one can estimate the parameter without access to the data. However, the memory or code size for storing the sufficient statistic may nonetheless still be prohibitive. Indeed, for n independent samples drawn from a k-nomial distribution with d = k - 1 degrees of freedom, the length of the code scales as d log n + O(1). In many applications, we may not have a useful notion of sufficient statistics (e.g., when the parametric family is not an exponential family) and we also may not need to reconstruct the generating distribution exactly. By adopting a Shannon-theoretic approach in which we allow a small error in estimating the generating distribution, we construct various approximate sufficient statistics and show that the code length can be reduced to (d/2) log n + O(1). We consider errors measured…
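The two scalings in the abstract can be illustrated numerically. The sketch below is not the paper's construction. It only compares two counts: the number of distinct count vectors (empirical types) of n samples over k symbols, which is the exact sufficient statistic for a k-nomial family, against a hypothetical grid that quantizes each of the d = k - 1 free coordinates at a resolution of about 1/sqrt(n). The function names and the choice of grid are assumptions made for illustration.

```python
import math


def exact_type_bits(n: int, k: int) -> float:
    """Bits needed to index the count vector of n samples over k symbols."""
    # The number of distinct count vectors is C(n + k - 1, k - 1).
    return math.log2(math.comb(n + k - 1, k - 1))


def quantized_bits(n: int, k: int) -> float:
    """Upper bound on bits for a grid of spacing ~1/sqrt(n) per free coordinate."""
    d = k - 1  # degrees of freedom of the k-nomial family
    cells_per_axis = math.ceil(math.sqrt(n))
    return d * math.log2(cells_per_axis)


if __name__ == "__main__":
    k = 3  # trinomial, so d = 2
    for n in (10**2, 10**4, 10**6):
        log_n = math.log2(n)
        # Ratios to log n: the first approaches d, the second approaches d/2.
        print(n, exact_type_bits(n, k) / log_n, quantized_bits(n, k) / log_n)
```

As n grows, the first ratio approaches d and the second approaches d/2, matching the leading terms d log n and (d/2) log n. The grid spacing reflects the usual heuristic that n samples determine a parameter only up to fluctuations of order 1/sqrt(n), so finer resolution carries little extra information about the generating distribution.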
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Bayesian Methods and Mixture Models
