TL;DR
This paper develops an information-theoretic framework for discrete diffusion models. It derives relations between mutual information and score-matching losses, and uses them to build principled estimators of the log-likelihood of discrete data.
Contribution
It introduces the I-MDSE and I-MDCE relations, which link mutual information to the minimum denoising score entropy (DSE) and denoising cross-entropy (DCE) losses, and uses them to obtain tight, principled estimators of log-likelihood for discrete diffusion models.
Findings
The proposed estimators are accurate and stable in experiments.
Theoretical relations are tight and generalize existing bounds.
Extensions include time-free formulas and likelihood ratio estimation.
Abstract
We present an information-theoretic framework for discrete diffusion models that yields principled estimators of log-likelihood using score-matching losses. Inspired by the I-MMSE identity for the Gaussian setup, we derive analogous results for the discrete setting. Specifically, we introduce the Information-Minimum Denoising Score Entropy (I-MDSE) relation, which links mutual information between data and its diffused version to the minimum denoising score entropy (DSE) loss. We extend this theory to masked diffusion and establish the Information-Minimum Denoising Cross-Entropy (I-MDCE) relation, connecting cross-entropy losses to mutual information in discrete masked processes. These results provide a time-integral decomposition of the log-likelihood of the data in terms of optimal score-based losses, showing that commonly used losses such as DSE and DCE are not merely variational…
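For orientation, the Gaussian I-MMSE identity that the abstract cites as its inspiration states that, for Y = sqrt(snr) X + N with standard Gaussian noise N independent of X, the derivative of I(X; Y) with respect to snr equals half the minimum mean-square error (information in nats). The sketch below checks this numerically in the one case where both sides have simple closed forms, a standard Gaussian input. It illustrates only the classical continuous identity, not the paper's discrete I-MDSE or I-MDCE relations, and the function names are hypothetical.

```python
import math


def mutual_information(snr: float) -> float:
    """I(X; sqrt(snr) X + N) in nats, assuming X and N are independent N(0, 1)."""
    return 0.5 * math.log1p(snr)


def mmse(snr: float) -> float:
    """Minimum mean-square error of estimating X from sqrt(snr) X + N (same assumption)."""
    return 1.0 / (1.0 + snr)


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical derivative of f at x by a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def check_i_mmse(snr_values, tol: float = 1e-6) -> bool:
    """Return True if dI/dsnr matches mmse(snr) / 2 at every given snr, within tol."""
    return all(
        abs(central_difference(mutual_information, s) - 0.5 * mmse(s)) < tol
        for s in snr_values
    )


if __name__ == "__main__":
    for s in (0.1, 1.0, 10.0):
        lhs = central_difference(mutual_information, s)
        rhs = 0.5 * mmse(s)
        print(f"snr={s}: dI/dsnr={lhs:.6f}, mmse/2={rhs:.6f}")
```

According to the abstract, the discrete relations play the analogous role, with the minimum DSE or DCE loss taking the place of the MMSE.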