On Error Propagation of Diffusion Models
Yangming Li, Mihaela van der Schaar

TL;DR
This paper develops a theoretical framework to analyze error propagation in diffusion models, demonstrating how cumulative error impacts generation quality and proposing a regularization method to mitigate this issue, leading to improved performance.
Contribution
It introduces a novel theoretical framework for error propagation in diffusion models and proposes an effective regularization technique to enhance their performance.
Findings
Regularization reduces error propagation in diffusion models
Proposed method improves image generation quality
Outperforms previous baseline methods
Abstract
Although diffusion models (DMs) have shown promising performance in a number of tasks (e.g., speech synthesis and image generation), they might suffer from error propagation because of their sequential structure. However, this is not certain because some sequential models, such as the Conditional Random Field (CRF), are free from this problem. To address this issue, we develop a theoretical framework to mathematically formulate error propagation in the architecture of DMs. The framework contains three elements: modular error, cumulative error, and the propagation equation. The modular and cumulative errors are related by the equation, which shows that DMs are indeed affected by error propagation. Our theoretical study also suggests that the cumulative error is closely related to the generation quality of DMs. Based on this finding, we apply the cumulative error as a…
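The intuition behind the abstract's claim, that small per-step errors in a sequential sampler can build up into a larger gap, can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the linear update, the Gaussian per-step error, and the names `simulate_chain`, `contraction`, and `modular_error_std` are hypothetical and are not the paper's propagation equation or its definitions of modular and cumulative error.

```python
import random


def simulate_chain(num_steps, modular_error_std, contraction=0.99, seed=0):
    """Toy sequential chain; returns |approx - ideal| after each step.

    Assumption: each step is a fixed linear map standing in for one exact
    denoising step. The approximate chain applies the same map to its own
    (already perturbed) state and adds a fresh per-step error.
    """
    rng = random.Random(seed)
    ideal = 1.0
    approx = 1.0
    gaps = []
    for _ in range(num_steps):
        # Exact step applied to the exact previous state.
        ideal = contraction * ideal + 0.01
        # Per-step ("modular"-like) error, drawn independently at every step.
        step_error = rng.gauss(0.0, modular_error_std)
        # Same map applied to the perturbed state, plus the new error.
        approx = contraction * approx + 0.01 + step_error
        # Accumulated ("cumulative"-like) gap between the two trajectories.
        gaps.append(abs(approx - ideal))
    return gaps


if __name__ == "__main__":
    gaps = simulate_chain(num_steps=100, modular_error_std=0.01)
    print("gap after step 1:  ", gaps[0])
    print("gap after step 100:", gaps[-1])
```

In this toy setting the gap obeys d_t = contraction * d_{t-1} + e_t, so earlier errors are carried forward into every later step rather than being discarded. With `modular_error_std=0.0` the two trajectories coincide exactly.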
Peer Reviews
Decision: ICLR 2024 poster
- The proposed method is a novel approach to measuring error propagation in diffusion models and offers a new perspective on diffusion model training. The authors argue that, apart from making the denoiser network more accurate, which has been the main focus of the literature so far, it is also important to regularize the denoiser so that it is robust to errors in its input during inference. This could have a significant impact on the broader diffusion generative model community.
- The presen
- The authors briefly address the trade-off between increased training time and reduced error propagation (resulting in better FID) for the 32x32 images of CIFAR and ImageNet but do not mention their CelebA experiments on 64x64 images. It is not clear whether the benefits scale with image size without increased overhead, since the error estimate may require more samples or a larger sampling length $L$.
- The proposed method is easy to understand and the writing is clean and easy to follow.
- The error propagation of diffusion models is an important question and worth studying. The topic is important.
Major:
- The assumption in the core theorem is absolutely wrong.
- "suppose that the output of neural network $\epsilon_\theta$ follows a standard Gaussian", which cannot be true. Because the noise-prediction model corresponds to the denoising score matching loss, it is proved that the ground truth of such a model is proportional to the score function of the distribution, i.e., $\nabla_{x_t} \log q_t(x_t)$. For a small $t$, such score function is quite complex and cannot be a simple and single-mode Ga
1. The theoretical framework and the bounds on error propagation through DMs are useful for analyzing the robustness of DMs.
2. The proposed method results in strong, significant improvements across a range of datasets.
3. The proposed method successfully decreases cumulative error in DMs.
1. The proposed method requires significant compute overhead, so gains need to be weighed against this increase in compute.
Taxonomy
Topics: Advanced Mathematical Modeling in Engineering · Advanced Neuroimaging Techniques and Applications · Music and Audio Processing
Methods: Diffusion
