Bayesian estimation of information-theoretic metrics for sparsely sampled distributions
Angelo Piga, Lluc Font-Pomarol, Marta Sales-Pardo, Roger Guimerà

TL;DR
This paper introduces a fast, semi-analytical Bayesian estimator for computing Shannon entropy and other information-theoretic metrics from sparse data, with precision at least comparable to existing methods.
Contribution
The paper presents a hierarchical Bayesian approach that is efficient and general, and that yields precise estimates for sparsely sampled distributions.
Findings
The estimator achieves accuracy comparable to or better than state-of-the-art methods.
It performs well for various information-theoretic metrics, including the Kullback-Leibler divergence.
It is efficient and applicable to a wide range of distributions.
Abstract
Estimating the Shannon entropy of a discrete distribution from which we have only observed a small sample is challenging. Estimating other information-theoretic metrics, such as the Kullback-Leibler divergence between two sparsely sampled discrete distributions, is even harder. Existing approaches to address these problems have shortcomings: they are biased, heuristic, work only for some distributions, and/or cannot be applied to all information-theoretic metrics. Here, we propose a fast, semi-analytical estimator for sparsely sampled distributions that is efficient, precise, and general. Its derivation is grounded in probabilistic considerations and uses a hierarchical Bayesian approach to extract as much information as possible from the few observations available. Our approach provides estimates of the Shannon entropy with precision at least comparable to the state of the art, and…
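To make the difficulty concrete, here is a minimal sketch comparing the naive plug-in entropy estimate with a simple Bayesian one. This is not the hierarchical estimator proposed in the paper. It is the textbook posterior mean of the entropy under a fixed uniform Dirichlet prior (concentration 1 per category), which for integer pseudo-counts reduces to differences of harmonic numbers. The function names and the example counts are illustrative assumptions.

```python
import math


def plugin_entropy(counts):
    """Plug-in (maximum-likelihood) entropy in nats.

    Uses observed frequencies directly; unobserved categories contribute 0,
    which tends to underestimate entropy when the sample is small.
    """
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n (H_0 = 0)."""
    return sum(1.0 / k for k in range(1, n + 1))


def dirichlet_mean_entropy(counts):
    """Posterior mean entropy in nats under a uniform Dirichlet prior.

    ASSUMPTION: a fixed prior with one pseudo-count per category, not the
    paper's hierarchical prior. With a_i = n_i + 1 and A = sum(a_i), the
    posterior mean is sum_i (a_i / A) * (H_A - H_{a_i}).
    """
    a = [c + 1 for c in counts]
    total = sum(a)
    h_total = harmonic(total)
    return sum((ai / total) * (h_total - harmonic(ai)) for ai in a)


# Hypothetical sparse sample: 4 observations spread over 8 possible categories.
counts = [2, 1, 1, 0, 0, 0, 0, 0]
print("plug-in:       ", plugin_entropy(counts))
print("Dirichlet mean:", dirichlet_mean_entropy(counts))
print("maximum (ln 8):", math.log(len(counts)))
```

With so few observations the plug-in value ignores the five unseen categories entirely, while the fixed-prior estimate is pulled toward the prior. Results of this kind depend strongly on the chosen concentration, which is one motivation for placing a prior on it in a hierarchical scheme.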
Taxonomy
Topics: Statistical Mechanics and Entropy · Anomaly Detection Techniques and Applications · Gaussian Processes and Bayesian Inference
