Diffusion Models Memorize in Training -- and Generalize in Inference
Tim Kaiser, Markus Kollmann

TL;DR
Diffusion models overfit the denoising objective during training yet generalize at inference, because their sampling trajectories stay far from the noisy training samples where the overfitting occurs.
Contribution
The paper shows that diffusion models overfit the denoising training objective yet still generalize at inference, and attributes this to model error and to sampling trajectories that lie far from noisy training samples.
Findings
Diffusion models overfit the denoising objective at intermediate noise levels.
The optimal denoising flow field localizes sharply around training points, but model error suppresses exact recall of those points.
Sampling trajectories during inference are far from noisy training samples, aiding generalization.
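To illustrate why an optimal denoiser localizes around training points, here is a minimal sketch, not the paper's toy model. It assumes a variance-exploding noising process x_t = x_0 + sigma * eps and a data distribution that is exactly the empirical distribution over a small training set. Under those assumptions, the posterior mean E[x_0 | x_t] is a softmax-weighted average of the training points. The function name `optimal_denoiser` and the four-point training set are hypothetical, chosen only for illustration.

```python
import numpy as np

def optimal_denoiser(x_t, train, sigma):
    """Posterior mean E[x_0 | x_t] for an empirical data distribution.

    Assumes x_t = x_0 + sigma * eps with eps ~ N(0, I) (an illustrative
    choice; the paper's parameterization may differ).
    Returns the denoised point and the weight on each training point.
    """
    # Squared distance from the noisy point to every training point.
    d2 = np.sum((train - x_t) ** 2, axis=1)
    # Gaussian log-likelihoods up to a constant shared by all points.
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max()  # subtract the max for a numerically stable softmax
    w = np.exp(logits)
    w /= w.sum()
    return w @ train, w

# Hypothetical training set: the corners of the unit square.
train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_t = np.array([0.1, 0.05])  # a noisy point near the first training point

for sigma in (0.05, 0.5, 5.0):
    x0_hat, w = optimal_denoiser(x_t, train, sigma)
    print(f"sigma={sigma}: x0_hat={x0_hat}, max weight={w.max():.3f}")
```

At small sigma nearly all the weight falls on the nearest training point, so the denoiser recalls it exactly. At large sigma the weights become nearly uniform and the output approaches the mean of the training set. A trained network only approximates this function, which is where the model error described in the findings enters.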
Abstract
Diffusion models generalize well in practice. However, an optimal diffusion model fully memorizes the training data and therefore fails to generalize, raising the question of what induces generalization in a real diffusion model. We show that, despite generalizing at the sample level, diffusion models progressively overfit the denoising training objective and thereby create a generalization gap between the performance on validation and training samples. This gap is most pronounced at intermediate noise levels. Using a fully analytic error-prone toy model, we trace the factors affecting the generalization gap. We find that the optimal denoising flow field localizes sharply around training points, but the model error suppresses the exact recall of training points, yielding a smooth, generalizing flow field. Finally, we find that the generalization gap observed in training does not…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Topological and Geometric Data Analysis · Medical Image Segmentation Techniques
