Information Theoretic Learning for Diffusion Models with Warm Start

Yirong Shen; Lu Gan; Cong Ling

arXiv:2510.20903·cs.IT·October 27, 2025

TL;DR

This paper takes an information-theoretic approach to diffusion models, deriving a tighter likelihood bound that improves the accuracy and efficiency of maximum likelihood training, and reports competitive NLL on CIFAR-10 and state-of-the-art results on ImageNet.

Contribution

The paper extends the classical relationship between KL divergence and Fisher information from Gaussian noise to arbitrary noise perturbations, which permits structured noise distributions and improves likelihood estimation in diffusion models.

Findings

01 Achieves competitive NLL on CIFAR-10

02 Sets SOTA results on ImageNet

03 Works effectively without data augmentation

Abstract

Generative models that maximize model likelihood have gained traction in many practical settings. Among them, perturbation-based approaches underpin many strong likelihood estimation models, yet they often face slow convergence and limited theoretical understanding. In this paper, we derive a tighter likelihood bound for noise-driven models to improve both the accuracy and efficiency of maximum likelihood learning. Our key insight extends the classical relationship between KL divergence and Fisher information to arbitrary noise perturbations, going beyond the Gaussian assumption and enabling structured noise distributions. This formulation allows flexible use of randomized noise distributions that naturally account for sensor artifacts, quantization effects, and data distribution smoothing, while remaining compatible with standard diffusion training. Treating the diffusion process as a Gaussian…
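For readers unfamiliar with the classical relationship the abstract refers to, the following is a minimal sketch of the well-known Gaussian case only: when independent Gaussian noise of variance t is added to both p and q, the KL divergence decays at half the rate of the relative Fisher information, d/dt KL(p_t || q_t) = -1/2 J(p_t || q_t). It is not the paper's generalization to arbitrary noise, and the function names (`kl_gauss`, `relative_fisher_gauss`, `check_identity`) and the choice of zero-mean 1-D Gaussians are assumptions made purely for illustration.

```python
import math

def kl_gauss(var_p, var_q):
    """KL( N(0, var_p) || N(0, var_q) ) in nats (closed form)."""
    ratio = var_p / var_q
    return 0.5 * (ratio - 1.0 - math.log(ratio))

def relative_fisher_gauss(var_p, var_q):
    """E_p[(d/dx log p - d/dx log q)^2] for zero-mean 1-D Gaussians.

    The scores are -x/var_p and -x/var_q, and E_p[x^2] = var_p.
    """
    return var_p * (1.0 / var_q - 1.0 / var_p) ** 2

def check_identity(var_p=1.0, var_q=3.0, t=0.5, eps=1e-5):
    """Return (numerical d/dt KL, -1/2 * relative Fisher) at noise level t.

    Illustrative assumption: adding independent N(0, t) noise to a
    zero-mean Gaussian simply raises its variance by t.
    """
    kl_at = lambda s: kl_gauss(var_p + s, var_q + s)
    # Central finite difference for the time derivative of the KL.
    dkl_dt = (kl_at(t + eps) - kl_at(t - eps)) / (2.0 * eps)
    rhs = -0.5 * relative_fisher_gauss(var_p + t, var_q + t)
    return dkl_dt, rhs

if __name__ == "__main__":
    lhs, rhs = check_identity()
    print(f"d/dt KL = {lhs:.8f},  -J/2 = {rhs:.8f}")
```

The two printed values should agree up to finite-difference error. Gaussians are used here only because both sides have closed forms, which makes the check easy to read.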
