A note on the unique properties of the Kullback--Leibler divergence for sampling via gradient flows
Francesca Romana Crucinio

TL;DR
This paper investigates the unique properties of the Kullback--Leibler divergence in gradient flow-based sampling methods, highlighting its advantage of not requiring the normalizing constant of the target distribution.
Contribution
It demonstrates that the KL divergence is the only Bregman divergence whose gradient flow, with respect to many popular metrics, can be implemented in sampling algorithms without knowing the normalizing constant of the target.
Findings
The gradient flow of the KL divergence does not require the normalizing constant of the target distribution.
KL is the only divergence in the Bregman family with this property.
This insight informs the choice of divergence in sampling methods.
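A short worked example may make the first finding concrete. It is a sketch for the Wasserstein metric only, and the notation is an assumption not taken from this page: the target density is written $\pi = \gamma / Z$, where $\gamma$ can be evaluated pointwise and $Z$ is the unknown normalising constant.

```latex
\begin{align*}
\pi(x) &= \frac{\gamma(x)}{Z}, \qquad Z = \int \gamma(x)\,\mathrm{d}x, \\
\mathrm{KL}(\mu \,\|\, \pi) &= \int \mu(x) \log\frac{\mu(x)}{\pi(x)}\,\mathrm{d}x, \\
\frac{\delta\,\mathrm{KL}}{\delta\mu}(x) &= \log\mu(x) - \log\gamma(x) + \log Z + 1, \\
\nabla \frac{\delta\,\mathrm{KL}}{\delta\mu}(x) &= \nabla\log\mu(x) - \nabla\log\gamma(x), \\
\partial_t \mu_t &= \nabla\cdot\Big(\mu_t\,\nabla\log\frac{\mu_t}{\gamma}\Big).
\end{align*}
```

The constant $\log Z$ enters the first variation additively, so it vanishes under the gradient and the flow depends on $\gamma$ alone. For contrast, assuming the Bregman divergence is built pointwise on densities from a convex function $\phi$, the choice $\phi(x) = x^2/2$ gives $\tfrac{1}{2}\int (\mu - \pi)^2\,\mathrm{d}x$, whose first variation $\mu - \gamma/Z$ has gradient $\nabla\mu - \nabla\gamma / Z$, which still contains $Z$.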
Abstract
We consider the problem of sampling from a probability distribution $\pi$ which admits a density w.r.t. a dominating measure. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise a divergence from $\pi$. The optimisation problem is normally solved through gradient flows in the space of probability distributions with an appropriate metric. We show that the Kullback--Leibler divergence is the only divergence in the family of Bregman divergences whose gradient flow w.r.t. many popular metrics does not require knowledge of the normalising constant of $\pi$.
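As an illustration of the property described in the abstract, the sketch below runs the unadjusted Langevin algorithm, a standard time discretisation of the Wasserstein gradient flow of the KL divergence, on a one-dimensional target known only up to a constant. It is not the paper's code: the Gaussian target, the function names (`log_unnorm_density`, `score`, `ula`, `mean_and_var`), the step size and the particle count are all assumptions chosen for the example.

```python
import math
import random


def log_unnorm_density(x, log_scale=0.0):
    """Log of an unnormalised target c * exp(-(x - 2)^2 / 2), with log c = log_scale.

    The Gaussian shape and the arbitrary scale c are assumptions for this example.
    """
    return log_scale - 0.5 * (x - 2.0) ** 2


def score(x, log_scale=0.0, eps=1e-5):
    """Central finite-difference estimate of d/dx log gamma(x).

    Any additive constant in log gamma (i.e. the scale c) cancels in the difference.
    """
    upper = log_unnorm_density(x + eps, log_scale)
    lower = log_unnorm_density(x - eps, log_scale)
    return (upper - lower) / (2.0 * eps)


def ula(n_particles=1000, n_steps=200, step=0.05, log_scale=0.0, seed=0):
    """Unadjusted Langevin algorithm: x <- x + h * score(x) + sqrt(2h) * noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    noise_scale = math.sqrt(2.0 * step)
    for _ in range(n_steps):
        xs = [
            x + step * score(x, log_scale) + noise_scale * rng.gauss(0.0, 1.0)
            for x in xs
        ]
    return xs


def mean_and_var(xs):
    """Sample mean and (biased) sample variance of a list of floats."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v


if __name__ == "__main__":
    # Rescaling the unnormalised density by exp(50) leaves the update unchanged
    # up to floating-point rounding, so both runs target the same distribution.
    for log_scale in (0.0, 50.0):
        m, v = mean_and_var(ula(log_scale=log_scale))
        print(f"log_scale={log_scale}: mean={m:.3f}, var={v:.3f}")
```

The update only ever touches the derivative of the log unnormalised density, so the scale factor never needs to be computed. The particles should settle near mean 2 and variance close to 1, with a small bias from the finite step size.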
