Relative Entropy and Statistics
François Bavaud

TL;DR
This paper explores how relative entropy, or Kullback-Leibler divergence, provides a unifying framework for various aspects of inferential statistics, including hypothesis testing, model selection, and data reconstruction.
Contribution
It demonstrates that the properties of relative entropy can formalize and unify core concepts of inferential statistics across different applications.
Findings
Relative entropy captures hypothesis refutability and model comparison.
It formalizes maximum likelihood and maximum entropy principles.
Applications include determining the order of a Markov chain and the EM algorithm.
Abstract
Formalising the confrontation of opinions (models) with observations (data) is the task of Inferential Statistics. Information Theory provides us with a basic functional, the relative entropy (or Kullback-Leibler divergence), an asymmetrical measure of dissimilarity between the empirical and the theoretical distributions. The formal properties of the relative entropy turn out to be able to capture every aspect of Inferential Statistics, as illustrated here, for simplicity, on dice (i.i.d. processes with finitely many outcomes): refutability (strict or probabilistic) and the asymmetry between data and models; small deviations and the rejection of a single hypothesis; competition between hypotheses and model selection; maximum likelihood, model inference and its limits; maximum entropy and the reconstruction of partially observed data; the EM algorithm; flow data and gravity modelling; determining the order of a Markov chain.
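As a minimal illustration of the central quantity in the abstract, the sketch below computes the relative entropy K(f||m) = sum_i f_i ln(f_i / m_i) between an empirical distribution f and a model m for a die. The function names, the roll counts and the two candidate models are hypothetical, chosen only for illustration; they are not taken from the paper. The last lines show the asymmetry between data and models: a model that gives probability zero to an observed face yields an infinite divergence (strict refutation), while the reverse divergence stays finite.

```python
import math

def empirical_distribution(counts):
    """Turn raw outcome counts into relative frequencies f_i = n_i / n."""
    n = sum(counts)
    return [c / n for c in counts]

def relative_entropy(f, m):
    """K(f||m) = sum_i f_i * ln(f_i / m_i), in nats.

    Terms with f_i == 0 contribute nothing. If the model gives zero
    probability to an outcome that was observed (f_i > 0, m_i == 0),
    the divergence is infinite.
    """
    total = 0.0
    for fi, mi in zip(f, m):
        if fi == 0:
            continue
        if mi == 0:
            return math.inf
        total += fi * math.log(fi / mi)
    return total

# Hypothetical data: 60 rolls of a six-sided die (illustrative counts only).
counts = [8, 12, 9, 11, 6, 14]
n = sum(counts)
f = empirical_distribution(counts)

# Hypothetical model 1: a fair die.
fair = [1 / 6] * 6
k_fair = relative_entropy(f, fair)

# Under standard large-sample theory, 2 * n * K(f||m) is approximately
# chi-square distributed (here with 5 degrees of freedom) when m is true.
g_statistic = 2 * n * k_fair

# Hypothetical model 2: a die that can never show a six.
no_six = [1 / 5] * 5 + [0.0]
k_no_six = relative_entropy(f, no_six)       # infinite: a six was observed
k_reverse = relative_entropy(no_six, f)      # finite: the asymmetry

print(f"K(f||fair)   = {k_fair:.4f} nats, 2nK = {g_statistic:.2f}")
print(f"K(f||no_six) = {k_no_six}")
print(f"K(no_six||f) = {k_reverse:.4f} nats")
```

The statistic 2nK is the usual likelihood-ratio (G) statistic, so a small value of K(f||fair) means the data give little ground for rejecting the fair-die hypothesis, whereas the infinite value for the second model reflects that a single observed six rules it out.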
Taxonomy
Topics: Statistical Mechanics and Entropy · Neural Networks and Applications · Blind Source Separation Techniques
