Distribution of Mutual Information

Marcus Hutter

arXiv:cs/0112019 · cs.AI · July 13, 2007 · 22 cites

Open Access

TL;DR

This paper derives reliable, quickly computable approximations for the posterior distribution p(I|n) of the mutual information between two random variables under a Bayesian prior, so that inference can go beyond the point estimate.

Contribution

The paper provides methods for computing the distribution of mutual information under a Bayesian prior, concentrating on its mean, variance, skewness, and kurtosis, together with practical approximations.

Findings

01. Derived accurate approximations for p(I|n)

02. Provided exact expression for the mean mutual information

03. Discussed numerical stability and applicability range
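For orientation, the exact mean mentioned in the findings can be written out as below. This is a sketch that assumes a Dirichlet prior over the joint probabilities t_ij; the prior family, the pseudo-count symbols n'_ij and n''_ij, and the digamma function ψ are not stated on this page and are introduced here only for illustration.

```latex
% Assumption: Dirichlet prior p(t) \propto \prod_{ij} t_{ij}^{\,n''_{ij}-1}.
% n_{ij} = n'_{ij} + n''_{ij}: observed counts n'_{ij} plus prior pseudo-counts n''_{ij}.
% n_{i+} = \sum_j n_{ij}, \quad n_{+j} = \sum_i n_{ij}, \quad n = \sum_{ij} n_{ij}.
E[I \mid n] \;=\; \frac{1}{n} \sum_{ij} n_{ij}
  \Big[ \psi(n_{ij}+1) - \psi(n_{i+}+1) - \psi(n_{+j}+1) + \psi(n+1) \Big]
```

As a sanity check, for large counts ψ(x+1) ≈ log x, so the expression reduces to the point estimate I(n_ij/n) = Σ_ij (n_ij/n) log(n_ij n / (n_i+ n_+j)).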

Abstract

The mutual information of two random variables i and j with joint probabilities t_ij is commonly used in learning Bayesian nets as well as in many other fields. The chances t_ij are usually estimated by the empirical sampling frequency n_ij/n leading to a point estimate I(n_ij/n) for the mutual information. To answer questions like "is I(n_ij/n) consistent with zero?" or "what is the probability that the true mutual information is much larger than the point estimate?" one has to go beyond the point estimate. In the Bayesian framework one can answer these questions by utilizing a (second order) prior distribution p(t) comprising prior information about t. From the prior p(t) one can compute the posterior p(t|n), from which the distribution p(I|n) of the mutual information can be calculated. We derive reliable and quickly computable approximations for p(I|n). We concentrate on the mean,…
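The pipeline described in the abstract (prior p(t), posterior p(t|n), distribution p(I|n)) can be illustrated by brute-force sampling. The sketch below is not the paper's method, which derives analytic approximations; it is a plain Monte Carlo baseline that assumes a symmetric Dirichlet prior, and the function names and the `prior` pseudo-count parameter are hypothetical.

```python
import math
import random


def mutual_information(t):
    """I(t) = sum_ij t_ij * log(t_ij / (t_i+ * t_+j)), in nats."""
    row = [sum(r) for r in t]          # marginals t_i+
    col = [sum(c) for c in zip(*t)]    # marginals t_+j
    total = 0.0
    for i, r in enumerate(t):
        for j, p in enumerate(r):
            if p > 0.0:                # 0 * log 0 is taken as 0
                total += p * math.log(p / (row[i] * col[j]))
    return total


def point_estimate(counts):
    """Plug-in estimate I(n_ij / n) from a table of counts."""
    n = sum(sum(r) for r in counts)
    return mutual_information([[c / n for c in r] for r in counts])


def sample_posterior_mi(counts, prior=1.0, draws=2000, seed=0):
    """Draw samples of I from p(I|n).

    Assumption: Dirichlet prior with `prior` pseudo-counts per cell, so the
    posterior is Dirichlet(n_ij + prior). A Dirichlet draw is obtained by
    normalising independent Gamma(n_ij + prior, 1) variates.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        g = [[rng.gammavariate(c + prior, 1.0) for c in r] for r in counts]
        s = sum(sum(r) for r in g)
        samples.append(mutual_information([[x / s for x in r] for r in g]))
    return samples


# Usage: a hypothetical 2x2 table of counts.
counts = [[10, 2], [3, 9]]
samples = sample_posterior_mi(counts)
posterior_mean = sum(samples) / len(samples)
# Posterior probability that the true mutual information is below 0.05 nats,
# one way to phrase "is I(n_ij/n) consistent with zero?".
prob_small = sum(s < 0.05 for s in samples) / len(samples)
print(point_estimate(counts), posterior_mean, prob_small)
```

Sampling like this is slow and noisy for large tables, which is the motivation for the quickly computable approximations the abstract refers to.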

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference