Smoothing the Score Function for Generalization in Diffusion Models: An Optimization-based Explanation Framework
Xinyu Zhou, Jiawei Zhang, Stephen J. Wright

TL;DR
This paper provides a theoretical explanation for memorization in diffusion models, showing how the empirical score function's structure leads to sampling collapse, and proposes methods to improve generalization by smoothing the score function.
Contribution
It introduces a novel theoretical framework explaining memorization in diffusion models and proposes two techniques, Noise Unconditioning and Temperature Smoothing, to enhance generalization.
Findings
The empirical score function is a weighted sum of Gaussian score functions with sharp softmax weights.
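Neural networks learn a smoother approximation of the empirical score function, which lets the local data manifold, rather than a single training sample, influence sampling.
The proposed methods, Noise Unconditioning and Temperature Smoothing, mitigate memorization and sampling collapse while maintaining high generation quality.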
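The first finding can be made concrete with a short derivation. The notation here is an assumption, not taken from the paper: training samples x_1, ..., x_N, and a Gaussian forward process x_t = \alpha_t x_0 + \sigma_t \varepsilon. Under that setup, the noised empirical distribution is a mixture of Gaussians, and its score is a softmax-weighted sum of the per-sample Gaussian scores:

```latex
% Assumed notation: x_1,\dots,x_N are training samples; \alpha_t, \sigma_t define the forward process.
p_t(x) = \frac{1}{N} \sum_{i=1}^{N} \mathcal{N}\!\left(x;\ \alpha_t x_i,\ \sigma_t^2 I\right)

\nabla_x \log p_t(x)
  = \sum_{i=1}^{N} w_i(x, t)\, \frac{\alpha_t x_i - x}{\sigma_t^2},
\qquad
w_i(x, t)
  = \frac{\exp\!\left(-\lVert x - \alpha_t x_i \rVert^2 / (2\sigma_t^2)\right)}
         {\sum_{j=1}^{N} \exp\!\left(-\lVert x - \alpha_t x_j \rVert^2 / (2\sigma_t^2)\right)}
```

Each term (\alpha_t x_i - x) / \sigma_t^2 is the score of a single Gaussian centered at \alpha_t x_i. As \sigma_t shrinks, the logits are scaled by 1 / (2\sigma_t^2), so the weights approach a one-hot vector on the nearest training sample. This is the sense in which the softmax weights are sharp.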
Abstract
Diffusion models achieve remarkable generation quality, yet face a fundamental challenge known as memorization, where generated samples can replicate training samples exactly. We develop a theoretical framework to explain this phenomenon by showing that the empirical score function (the score function corresponding to the empirical distribution) is a weighted sum of the score functions of Gaussian distributions, in which the weights are sharp softmax functions. This structure causes individual training samples to dominate the score function, resulting in sampling collapse. In practice, approximating the empirical score function with a neural network can partially alleviate this issue and improve generalization. Our theoretical framework explains why: In training, the neural network learns a smoother approximation of the weighted sum, allowing the sampling process to be influenced by…
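The structure described in the abstract can be sketched numerically. The following is a minimal NumPy illustration, assuming a Gaussian forward process with scale `alpha` and noise level `sigma`. The function names are hypothetical. The `temperature` argument is an illustrative knob that divides the softmax logits to show what smoothing the weights does; it is not claimed to be the paper's exact Temperature Smoothing definition.

```python
import numpy as np


def softmax_weights(x, data, sigma, alpha=1.0, temperature=1.0):
    """Softmax weights of each training sample at point x.

    logits_i = -||x - alpha * x_i||^2 / (2 * sigma^2 * temperature).
    temperature=1.0 gives the exact empirical-score weights;
    temperature > 1 flattens them (illustrative assumption only).
    """
    sq_dist = np.sum((x - alpha * data) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma**2 * temperature)
    logits = logits - logits.max()  # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()


def empirical_score(x, data, sigma, alpha=1.0, temperature=1.0):
    """Weighted sum of per-sample Gaussian scores (alpha * x_i - x) / sigma^2."""
    w = softmax_weights(x, data, sigma, alpha, temperature)
    return (w[:, None] * (alpha * data - x)).sum(axis=0) / sigma**2


# Toy data: three training points in 2D and one query point.
data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([0.2, 0.1])

sharp = softmax_weights(x, data, sigma=0.05)                      # exact weights
smooth = softmax_weights(x, data, sigma=0.05, temperature=100.0)  # flattened weights
score = empirical_score(x, data, sigma=0.05)

print("sharp weights: ", sharp)
print("smooth weights:", smooth)
print("score at x:    ", score)
```

At a small noise level the exact weights put almost all mass on the nearest training point, so the score points straight at that one sample. Flattening the weights spreads influence across several nearby samples, which is the qualitative effect the abstract attributes to a smoother approximation.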
