On the Generalization Properties of Diffusion Models
Puheng Li, Zhong Li, Huishuai Zhang, Jiang Bian

TL;DR
This paper analyzes the generalization of score-based diffusion models theoretically. It shows that, with early stopping, the generalization error is polynomially small in the sample size and the model capacity and is not exponentially large in the data dimension, and that mode shifts in the data hurt generalization.
Contribution
It offers the first comprehensive theoretical estimates of the generalization gap for diffusion models, including data-dependent scenarios with mode shifts.
Findings
Generalization error is polynomially small in both the sample size and the model capacity.
With early stopping, the error is not exponentially large in the data dimension, so the curse of dimensionality is avoided.
Mode shifts in data negatively impact model generalization.
Abstract
Diffusion models are a class of generative models that serve to establish a stochastic transport map between an empirically observed, yet unknown, target distribution and a known prior. Despite their remarkable success in real-world applications, a theoretical understanding of their generalization capabilities remains underdeveloped. This work embarks on a comprehensive theoretical exploration of the generalization attributes of diffusion models. We establish theoretical estimates of the generalization gap that evolves in tandem with the training dynamics of score-based diffusion models, suggesting a polynomially small generalization error in both the sample size and the model capacity, evading the curse of dimensionality (i.e., not exponentially large in the data dimension) when early-stopped. Furthermore, we extend our quantitative analysis to a…
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Theoretical and Computational Physics · Quantum many-body systems
Methods: Diffusion
