TL;DR
This paper develops an algorithm-dependent generalisation theory for diffusion models, introducing score stability to explain how various implicit regularisation techniques improve model generalisation in high-dimensional settings.
Contribution
It introduces the concept of score stability and provides a framework for understanding implicit regularisation in diffusion models beyond model structure.
Findings
Score stability, the sensitivity of the learned score function to single-sample changes in the dataset, is linked to generalisation performance.
Early stopping and discretisation act as implicit regularisers.
SGD optimisation contributes to model regularisation.
Abstract
The success of denoising diffusion models raises important questions regarding their generalisation behaviour, particularly in high-dimensional settings. Notably, it has been shown that when training and sampling are performed perfectly, these models memorise training data -- implying that some form of regularisation is essential for generalisation. Existing theoretical analyses primarily rely on algorithm-independent techniques such as uniform convergence, heavily utilising model structure to obtain generalisation bounds. In this work, we instead leverage the algorithmic aspects that promote generalisation in diffusion models, developing a general theory of algorithm-dependent generalisation for this setting. Borrowing from the framework of algorithmic stability, we introduce the notion of score stability, which quantifies the sensitivity of score-matching algorithms to dataset…
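The memorisation claim in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a variance-exploding forward process (noisy sample = training point + sigma * Gaussian noise) and a one-dimensional toy dataset, and the names `empirical_score`, `denoise`, `train`, and `query` are hypothetical. Under those assumptions the exact minimiser of the empirical denoising objective is the score of a Gaussian mixture centred on the training points, and the denoised estimate collapses onto the nearest training point as sigma shrinks.

```python
import math

def empirical_score(x, data, sigma):
    """Score of the mixture (1/n) * sum_i N(x_i, sigma^2) at a 1-D point x."""
    # Unnormalised log-weights of each component, shifted for numerical stability.
    logw = [-(x - xi) ** 2 / (2 * sigma ** 2) for xi in data]
    m = max(logw)
    w = [math.exp(l - m) for l in logw]
    total = sum(w)  # at least 1.0, since the largest weight is exp(0)
    # Posterior mean of the clean training point given the noisy observation x.
    post_mean = sum(wi * xi for wi, xi in zip(w, data)) / total
    return (post_mean - x) / sigma ** 2

def denoise(x, data, sigma):
    """Tweedie's formula: E[x_0 | x] = x + sigma^2 * score(x)."""
    return x + sigma ** 2 * empirical_score(x, data, sigma)

train = [-2.0, 0.5, 3.0]  # hypothetical toy training set
query = 0.9               # hypothetical noisy sample

for sigma in (2.0, 0.5, 0.05):
    # Large sigma blends all training points; small sigma snaps to the nearest one.
    print(sigma, denoise(query, train, sigma))
```

At the largest noise level the estimate is a blend of all three training points, while at the smallest it is essentially the nearest training point, 0.5. This is one way to see why some form of regularisation (for example stopping before sigma reaches zero) matters for generalisation.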
Peer Reviews
Decision·ICLR 2026 Poster
**1. Introduction of an algorithm-dependent generalization framework:** The paper introduces score stability, a novel, algorithm-dependent framework for analyzing generalization in diffusion models. Unlike previous works that rely on algorithm-independent uniform convergence bounds, this approach explicitly quantifies how training and sampling algorithms affect generalization. By linking the sensitivity of the learned score function to single-sample perturbations with the expected generalization
**1. Difficulty in computing score stability:** While the paper introduces the notion of score stability and shows its theoretical connection to generalization, it does not provide a clear or practical method for computing or estimating $\epsilon_{stab}$ for a given algorithm and dataset. In practice, evaluating score stability requires measuring the sensitivity of the learned score function to changes in individual data points, which may involve retraining or coupling multiple stochastic output
The paper is relatively clearly written and the math claims appear to be rigorous (although I did not check them in detail). The notion of score stability is interesting and appears to be a useful conceptual/technical contribution. In the main text, the authors strike a good balance between presenting their results and presenting the intuition for them, without overwhelming the reader with technical detail.
I understand that the point of the paper is to prove bounds in the tradition of recent formal (mathematical) work on diffusion models, but I am still left wondering about the extent to which these bounds are tight and/or interesting in practice. There are no experiments in the paper; would it be possible to conduct any experiments to show that the bounds are interesting, or that they are at least qualitatively consistent with what one sees in experiments? Relatedly, figures (either showing schem
1. The proposed score stability framework is sound and novel. 2. The analysis is comprehensive, covering a broad range of aspects, including discretization, early stopping and the optimization algorithm.
1. No empirical verification of the theorem is provided. 2. It is unclear to me how the results explain why diffusion models generalize when the empirical optimum can only memorize. 3. Empirically, early stopping and discretization are not the key to generalization.
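The sensitivity measurement that the reviews describe (changing one data point and comparing the resulting score functions) can be sketched on a toy problem. This is only a rough proxy under stated assumptions, not the paper's definition of $\epsilon_{stab}$ and not an experiment from the paper: the "algorithm" here is the closed-form empirical score under a variance-exploding noise model in one dimension, and `empirical_score`, `score_sensitivity`, `n_eval`, and the synthetic datasets are all hypothetical.

```python
import math
import random

def empirical_score(x, data, sigma):
    """Score of the mixture (1/n) * sum_i N(x_i, sigma^2) at a 1-D point x."""
    logw = [-(x - xi) ** 2 / (2 * sigma ** 2) for xi in data]
    m = max(logw)  # shift log-weights for numerical stability
    w = [math.exp(l - m) for l in logw]
    post_mean = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
    return (post_mean - x) / sigma ** 2

def score_sensitivity(data, neighbour, sigma, n_eval=2000, seed=0):
    """Monte Carlo proxy: mean squared difference between the two scores,
    evaluated at noisy points generated from `data` at noise level sigma."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_eval):
        x = rng.choice(data) + sigma * rng.gauss(0.0, 1.0)
        diff = empirical_score(x, data, sigma) - empirical_score(x, neighbour, sigma)
        total += diff ** 2
    return total / n_eval

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(50)]  # synthetic training set
neighbour = list(data)
neighbour[0] = rng.gauss(0.0, 1.0)               # replace a single training point

for sigma in (0.1, 0.5, 2.0):
    print(sigma, score_sensitivity(data, neighbour, sigma))
```

For a trained network, the closed-form score would have to be replaced by retraining on the neighbouring dataset, which is the practical cost raised in the first weakness above.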
