Understanding diffusion models requires rethinking (again) generalization
Pierre Marion, Yu-Han Wu

TL;DR
This position paper challenges existing theories of generalization in diffusion models and argues that the field should study what models learn before memorization sets in. It backs the argument with empirical insights and a set of open questions.
Contribution
It advocates shifting focus from explaining why diffusion models do not memorize to understanding what they learn in the phases before memorization, drawing on an empirical analysis and closing with open questions.
Findings
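Conducts an empirical study of diffusion models trained on CIFAR-10.
Highlights that memorization and generalization are incompatible in diffusion models, unlike in supervised learning: a fully memorizing model generates copies of its training data rather than novel samples.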
Proposes open questions to guide future research on diffusion model learning.
Abstract
This position paper argues that understanding generalization in diffusion models requires fundamentally new theoretical frameworks that go beyond both classical statistical learning theory and the benign overfitting paradigm developed for supervised learning. In diffusion models, unlike in supervised learning, memorization of training data and generalization to novel samples are incompatible: a model that has fully memorized its training set generates copies rather than novel data. Several theoretical explanations for why practical diffusion models nevertheless generalize have been proposed, based on capacity limitations, implicit regularization from optimization, or architectural inductive biases, but their interactions remain unclear. We argue that the field should pivot from explaining why the diffusion models do not memorize to investigating what the model actually learns during…
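To make the memorization claim concrete, here is a minimal sketch that is not taken from the paper. It assumes a variance-exploding forward process, x_t = x_0 + sigma * noise, and a training distribution that is uniform over a small toy dataset. Under those assumptions, the exact minimizer of the denoising objective is the posterior mean E[x_0 | x_t], which is a softmax-weighted average of the training points. The function name `empirical_optimal_denoiser` and the four 2-D points are hypothetical and chosen only for illustration.

```python
import numpy as np

def empirical_optimal_denoiser(x_t, train, sigma):
    """Posterior mean E[x_0 | x_t] when x_0 is uniform over `train`
    and x_t = x_0 + sigma * noise (variance-exploding convention, an assumption)."""
    # Squared distance from the noisy point to every training point.
    sq_dist = np.sum((train - x_t) ** 2, axis=1)
    # Softmax over -sq_dist / (2 sigma^2), shifted for numerical stability.
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max()
    weights = np.exp(logits)
    weights /= weights.sum()
    # Weighted average of training points.
    return weights @ train

# Toy "training set" (hypothetical data, not CIFAR-10).
train = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
# A slightly perturbed copy of the first training point.
x_t = train[0] + np.array([0.005, -0.003])

low_noise = empirical_optimal_denoiser(x_t, train, sigma=0.01)
high_noise = empirical_optimal_denoiser(x_t, train, sigma=100.0)
print(low_noise)   # collapses onto the nearest training point
print(high_noise)  # close to the mean of the training set
```

At low noise the weights concentrate on the nearest training point, so a sampler driven by this exact minimizer would reproduce training data. This is the sense in which full memorization and generation of novel samples cannot coexist, and why what a trained network does differently from this closed form is the interesting question.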
