Bayesian estimation of the Kullback-Leibler divergence for categorical systems using mixtures of Dirichlet priors
Francesco Camaglia, Ilya Nemenman, Thierry Mora, Aleksandra M. Walczak

TL;DR
This paper introduces a Bayesian method using Dirichlet mixtures to estimate the Kullback-Leibler divergence between categorical distributions, especially effective with small samples and high-dimensional data.
Contribution
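It presents a Bayesian estimator of the Kullback-Leibler divergence between categorical distributions, built on a mixture of Dirichlet priors and extended to the squared Hellinger divergence, that outperforms other estimation techniques when samples are small.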
Findings
Estimator outperforms empirical methods in small-sample scenarios.
Effective for high-dimensional categorical data.
Extends to squared Hellinger divergence.
Abstract
In many applications in biology, engineering and economics, identifying similarities and differences between distributions of data from complex processes requires comparing finite categorical samples of discrete counts. Statistical divergences quantify the difference between two distributions. However, their estimation is very difficult and empirical methods often fail, especially when the samples are small. We develop a Bayesian estimator of the Kullback-Leibler divergence between two probability distributions that makes use of a mixture of Dirichlet priors on the distributions being compared. We study the properties of the estimator on two examples: probabilities drawn from Dirichlet distributions, and random strings of letters drawn from Markov chains. We extend the approach to the squared Hellinger divergence. Both estimators outperform other estimation techniques, with better…
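To make the small-sample problem concrete, here is a minimal Python sketch contrasting the empirical (plug-in) estimate with a simplified Bayesian one. It assumes a single Dirichlet prior with a fixed concentration on each distribution, not the mixture of Dirichlet priors the paper develops, so it illustrates the idea rather than reproducing the authors' estimator. The function names and the `alpha`, `beta` parameters are hypothetical. The squared Hellinger divergence is taken here as 1 - sum_i sqrt(q_i t_i), which is an assumption about the convention.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward until the asymptotic series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252)))


def plugin_kl(n, m):
    """Empirical KL estimate from counts n and m; infinite if m misses a category seen in n."""
    N, M = sum(n), sum(m)
    total = 0.0
    for ni, mi in zip(n, m):
        if ni == 0:
            continue
        if mi == 0:
            return math.inf
        total += (ni / N) * math.log((ni / N) / (mi / M))
    return total


def kl_posterior_mean(n, m, alpha=1.0, beta=1.0):
    """Posterior mean of KL(q||t) with q ~ Dir(n + alpha), t ~ Dir(m + beta).

    Assumption: fixed concentrations alpha and beta (no mixture over them).
    Uses E[q_i log q_i] = (a_i/A)(psi(a_i + 1) - psi(A + 1))
    and  E[log t_i]     = psi(b_i) - psi(B).
    """
    a = [ni + alpha for ni in n]
    b = [mi + beta for mi in m]
    A, B = sum(a), sum(b)
    return sum(
        (ai / A) * (digamma(ai + 1) - digamma(A + 1) - digamma(bi) + digamma(B))
        for ai, bi in zip(a, b)
    )


def _mean_sqrt(a):
    """E[sqrt(q_i)] for q ~ Dir(a), from the Beta marginal of each component."""
    A = sum(a)
    return [
        math.exp(math.lgamma(ai + 0.5) + math.lgamma(A)
                 - math.lgamma(ai) - math.lgamma(A + 0.5))
        for ai in a
    ]


def hellinger2_posterior_mean(n, m, alpha=1.0, beta=1.0):
    """Posterior mean of 1 - sum_i sqrt(q_i t_i) under the same fixed priors."""
    sq = _mean_sqrt([ni + alpha for ni in n])
    st = _mean_sqrt([mi + beta for mi in m])
    return 1.0 - sum(x * y for x, y in zip(sq, st))


if __name__ == "__main__":
    n, m = [3, 1, 0], [4, 0, 2]  # tiny samples; m never observes category 2
    print("plug-in KL:          ", plugin_kl(n, m))
    print("posterior-mean KL:   ", kl_posterior_mean(n, m))
    print("posterior-mean H^2:  ", hellinger2_posterior_mean(n, m))
```

With counts this small, the plug-in estimate is infinite as soon as the second sample misses a category that the first one contains, while the posterior mean stays finite. How well it performs still depends on the choice of concentration, which is the dependence a mixture over concentrations is meant to address.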
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Advanced Statistical Methods and Models
