Exponential families from a single KL identity

Marc Dymetman

arXiv:2604.28036·cs.LG·May 1, 2026

TL;DR

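The paper isolates a single identity for exponential families that expresses a difference of KL divergences through the log-partition function and a moment. Combined with the nonnegativity of KL, it yields by substitution and rearrangement a cluster of classical results relevant to variational inference, entropy-regularized reinforcement learning, and RLHF.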

Contribution

It introduces a single KL identity for exponential families from which multiple key results follow by direct substitution and rearrangement, rather than by separate, heavier arguments.

Findings

01. Unified derivation of classical results in exponential family theory

02. Explicit formulas for KL divergence and log-partition function properties

03. Simplified proofs of variational principles and optimality conditions

Abstract

Exponential families encompass the distributions central to modern machine learning -- softmax, Gaussians, and Boltzmann distributions -- and underlie the theory of variational inference, entropy-regularized reinforcement learning, and RLHF. We isolate a simple identity for exponential families that expresses the KL difference $\mathrm{KL}(q \,\|\, p_{\lambda_2}) - \mathrm{KL}(q \,\|\, p_{\lambda_1})$ in terms of the log-partition function $A(\lambda)$ and the moment $\mu_q$. Remarkably, this identity together with the single fact that $\mathrm{KL} \geq 0$ (with equality iff $p = q$) suffices, by direct substitution and rearrangement, to derive a cluster of results that are classically obtained by separate, heavier arguments: a generalized three-point identity for arbitrary reference distributions, Pythagorean theorems for I-projections and reverse I-projections, convexity of the log-partition…
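To make the kind of identity described in the abstract concrete, here is a minimal numerical sketch. It assumes the standard parameterization $p_\lambda(x) = h(x)\exp(\langle \lambda, \phi(x)\rangle - A(\lambda))$ on a finite sample space, with $\mu_q = \mathbb{E}_q[\phi(x)]$. Under that assumption, expanding both KL terms gives $\mathrm{KL}(q \,\|\, p_{\lambda_2}) - \mathrm{KL}(q \,\|\, p_{\lambda_1}) = A(\lambda_2) - A(\lambda_1) - \langle \lambda_2 - \lambda_1, \mu_q\rangle$. This form is not quoted from the paper, whose exact statement may differ. All names (`phi`, `h`, `log_partition`, `family_member`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_partition(lam, phi, h):
    """A(lam) = log sum_x h(x) exp(<lam, phi(x)>), computed stably."""
    scores = phi @ lam + np.log(h)
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def family_member(lam, phi, h):
    """Probabilities p_lam(x) = h(x) exp(<lam, phi(x)> - A(lam))."""
    scores = phi @ lam + np.log(h)
    return np.exp(scores - log_partition(lam, phi, h))

def kl(q, p):
    """KL(q || p) for strictly positive distributions on a finite set."""
    return float(np.sum(q * (np.log(q) - np.log(p))))

# Assumed toy setup: 6 states, 3-dimensional sufficient statistic.
rng = np.random.default_rng(0)
n_states, dim = 6, 3
phi = rng.normal(size=(n_states, dim))      # sufficient statistics phi(x)
h = rng.uniform(0.5, 1.5, size=n_states)    # positive base measure h(x)
q = rng.dirichlet(np.ones(n_states))        # arbitrary q, not in the family
lam1 = rng.normal(size=dim)
lam2 = rng.normal(size=dim)

mu_q = q @ phi                              # moment of q: E_q[phi(x)]

# Left side: difference of the two KL divergences.
lhs = kl(q, family_member(lam2, phi, h)) - kl(q, family_member(lam1, phi, h))
# Right side: log-partition difference minus the linear moment term.
rhs = log_partition(lam2, phi, h) - log_partition(lam1, phi, h) - (lam2 - lam1) @ mu_q

identity_gap = abs(lhs - rhs)
print(f"lhs={lhs:.12f} rhs={rhs:.12f} gap={identity_gap:.2e}")
```

The entropy of `q` and the base-measure term cancel in the difference, which is why the right side depends on `q` only through `mu_q`. The two sides should agree up to floating-point error for any strictly positive `q`.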
