On the Generalization of Diffusion Model
Mingyang Yi, Jiacheng Sun, Zhenguo Li

TL;DR
This paper investigates the generalization capability of diffusion probabilistic models, defines it via the mutual information between the generated data and the training set, and proposes a new training objective to improve the models' ability to generate unseen data.
Contribution
It introduces a formal measure of generalization for diffusion models and proposes a new training objective to enhance their extrapolation ability.
Findings
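For the empirical optimal diffusion model, deterministic samplers produce data highly related to the training set, indicating poor generalization.
Even after sufficient training, a slight difference remains between the trained model and the empirical optimum, and this difference is critical for generalization.
The proposed training objective improves the model's ability to generate unseen data.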
Abstract
Diffusion probabilistic generative models are widely used to generate high-quality data. Though they can synthesize data that does not exist in the training set, the rationale behind such generalization is still unexplored. In this paper, we formally define the generalization of a generative model, measured by the mutual information between the generated data and the training set. The definition originates from the intuition that a model which generates data less correlated with the training set exhibits better generalization ability. Meanwhile, we show that for the empirical optimal diffusion model, the data generated by a deterministic sampler are all highly related to the training set, and thus exhibit poor generalization. This result contradicts the observed extrapolation ability (generating unseen data) of trained diffusion models, which approximate the empirical optima.…
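To make the claim about the empirical optimum concrete, here is a minimal sketch rather than the authors' code. It assumes a Gaussian forward process x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * noise with x_0 uniform over the training set, in which case the optimal denoiser is a softmax-weighted average of training points. The deterministic update is a DDIM-style step chosen for illustration; the function names, the two-point training set, and the linear alpha_bar schedule are all hypothetical.

```python
import numpy as np

def empirical_optimal_denoiser(x_t, train, alpha_bar):
    """Posterior mean E[x_0 | x_t] when x_0 is uniform over the rows of `train`."""
    # Squared distance from x_t to every (scaled) training point.
    sq_dist = np.sum((x_t - np.sqrt(alpha_bar) * train) ** 2, axis=1)
    logits = -sq_dist / (2.0 * (1.0 - alpha_bar))
    logits -= logits.max()              # stabilise the softmax
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights @ train              # convex combination of training points

def deterministic_sample(x_start, train, alpha_bars):
    """DDIM-style (noise-free) sampling; alpha_bars runs from noisy to clean."""
    x = x_start
    for ab_t, ab_prev in zip(alpha_bars[:-1], alpha_bars[1:]):
        x0_hat = empirical_optimal_denoiser(x, train, ab_t)
        eps_hat = (x - np.sqrt(ab_t) * x0_hat) / np.sqrt(1.0 - ab_t)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
    # Final denoising step at the cleanest noise level.
    return empirical_optimal_denoiser(x, train, alpha_bars[-1])

# Toy setup (assumed for illustration only).
train = np.array([[-2.0, 0.0], [2.0, 0.0]])
alpha_bars = np.linspace(1e-4, 0.9999, 50)
sample = deterministic_sample(np.array([0.5, 0.3]), train, alpha_bars)
nearest_gap = np.min(np.linalg.norm(train - sample, axis=1))
```

In this toy setting `nearest_gap` should be close to zero: the sampler's output is a function of the training set alone and collapses onto a training point, which is the kind of dependence on the training data that the abstract describes as poor generalization.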
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Statistical Mechanics and Entropy
Methods: Diffusion
