How to use KL-divergence to construct conjugate priors, with well-defined non-informative limits, for the multivariate Gaussian
Niko Brümmer

TL;DR
This paper uses the scaled KL-divergence between multivariate Gaussians to construct conjugate priors for multivariate Gaussian models. The resulting priors have well-defined non-informative limits that do not violate the restriction on the Wishart shape parameter.
Contribution
It defines Wishart and normal-Wishart priors via the scaled KL-divergence, used as an energy function, so that the non-informative limit can be taken without moving the shape parameter outside its allowed range.
Findings
KL-based priors have a well-defined non-informative limit.
In the non-informative limit, the posterior mode coincides with the maximum likelihood estimate (MLE).
The approach respects the shape parameter restrictions of Wishart distributions.
Abstract
The Wishart distribution is the standard conjugate prior for the precision of the multivariate Gaussian likelihood, when the mean is known -- while the normal-Wishart can be used when the mean is also unknown. It is however not so obvious how to assign values to the hyperparameters of these distributions. In particular, when forming non-informative limits of these distributions, the shape (or degrees of freedom) parameter of the Wishart must be handled with care. The intuitive solution of directly interpreting the shape as a pseudocount and letting it go to zero, as proposed by some authors, violates the restrictions on the shape parameter. We show how to use the scaled KL-divergence between multivariate Gaussians as an energy function to construct Wishart and normal-Wishart conjugate priors. When used as informative priors, the salient feature of these distributions is the mode, while…
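The known-mean case described in the abstract can be sketched numerically. This is a minimal illustration under stated assumptions, not the paper's own parametrization: the names `sigma0` (a reference covariance), `nu` (the KL scale factor), and all three function names are hypothetical, and the direction of the KL-divergence is chosen here so that the energy yields a Wishart density in the precision P. With KL(N(0, sigma0) || N(0, inv(P))) = 0.5 [tr(P sigma0) - d - ln det P - ln det sigma0], the density exp(-nu KL) is proportional to det(P)^(nu/2) exp(-(nu/2) tr(P sigma0)), which is a Wishart with shape nu + d + 1 and scale matrix inv(nu sigma0).

```python
import numpy as np


def kl_zero_mean(sigma0, precision):
    """KL( N(0, sigma0) || N(0, inv(precision)) ) for positive-definite inputs."""
    d = sigma0.shape[0]
    _, logdet_p = np.linalg.slogdet(precision)
    _, logdet_s = np.linalg.slogdet(sigma0)
    return 0.5 * (np.trace(precision @ sigma0) - d - logdet_p - logdet_s)


def wishart_from_kl(sigma0, nu):
    """Wishart (dof, scale) whose density in P is proportional to exp(-nu * KL).

    Assumed parametrization: density proportional to
    det(P)^((dof - d - 1) / 2) * exp(-0.5 * tr(inv(scale) @ P)).
    """
    d = sigma0.shape[0]
    dof = nu + d + 1.0                  # stays above d - 1 for every nu > 0
    scale = np.linalg.inv(nu * sigma0)  # inv(scale) = nu * sigma0
    return dof, scale


def wishart_mode(dof, scale):
    """Mode of the Wishart density, defined when dof >= d + 1."""
    d = scale.shape[0]
    return (dof - d - 1.0) * scale


# Hypothetical reference covariance, used only for illustration.
sigma0 = np.array([[2.0, 0.5], [0.5, 1.0]])
for nu in (10.0, 1.0, 0.01):
    dof, scale = wishart_from_kl(sigma0, nu)
    mode = wishart_mode(dof, scale)
    # The mode stays at inv(sigma0) while the shape tends to d + 1 as nu shrinks.
    print(nu, dof, np.allclose(mode, np.linalg.inv(sigma0)))
```

In this sketch the mode of the prior is inv(sigma0) for every positive `nu`, and the shape parameter approaches d + 1 rather than zero as `nu` goes to zero, so it never drops below the Wishart restriction of d - 1.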
Taxonomy
Topics: Statistical Mechanics and Entropy · Gaussian Processes and Bayesian Inference · Advanced Statistical Methods and Models
