Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data
Saptarshi Chakraborty, Quentin Berthet, Peter L. Bartlett

TL;DR
This paper provides finite-sample error bounds for score-based diffusion models, showing that they adapt to the data's intrinsic low-dimensional structure and converge faster than rates that depend on the ambient dimension would suggest.
Contribution
It introduces a new theoretical framework with Wasserstein-$p$ error bounds that depend on the data's intrinsic dimension, not ambient dimension, under mild conditions.
Findings
Error bounds scale as $n^{-1/d^*_{p,q}(u)}$ in the sample size $n$.
Diffusion models adapt to intrinsic data geometry, mitigating the curse of dimensionality.
The $(p,q)$-Wasserstein dimension extends classical notions to unbounded distributions.
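To make the scaling in the findings above concrete, here is a minimal Python sketch. It assumes an idealized rate of the form $n^{-1/d}$ with all constants and logarithmic factors dropped. The function names and the dimensions 10 and 1000 are hypothetical values chosen for illustration, not figures from the paper.

```python
import math


def rate(n: int, d: float) -> float:
    """Idealized error rate n^(-1/d); constants and log factors are ignored."""
    return n ** (-1.0 / d)


def samples_needed(eps: float, d: float) -> float:
    """Sample size at which n^(-1/d) equals eps, i.e. n = eps^(-d)."""
    return eps ** (-d)


# Hypothetical dimensions, for illustration only.
intrinsic_dim = 10    # stand-in for the intrinsic (Wasserstein) dimension
ambient_dim = 1000    # stand-in for the ambient dimension of the data

n = 10**6
print("rate with intrinsic dimension:", rate(n, intrinsic_dim))
print("rate with ambient dimension:  ", rate(n, ambient_dim))

# Samples needed to reach error 0.5 under each exponent.
print("n for eps=0.5, intrinsic:", samples_needed(0.5, intrinsic_dim))
print("n for eps=0.5, ambient:  ", samples_needed(0.5, ambient_dim))
```

Under these assumptions, the same sample size gives a much smaller error bound when the exponent is governed by the intrinsic dimension, and the sample size needed for a fixed error grows exponentially in whichever dimension appears in the exponent.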
Abstract
Despite the remarkable empirical success of score-based diffusion models, their statistical guarantees remain underdeveloped. Existing analyses often provide pessimistic convergence rates that do not reflect the intrinsic low-dimensional structure common in real data, such as that arising in natural images. In this work, we study the statistical convergence of score-based diffusion models for learning an unknown distribution from finitely many samples. Under mild regularity conditions on the forward diffusion process and the data distribution, we derive finite-sample error bounds on the learned generative distribution, measured in the Wasserstein-$p$ distance. Unlike prior results, our guarantees hold for all $p \ge 1$ and require only a finite-moment assumption on the data distribution, without compact-support, manifold, or smooth-density conditions. Specifically, given $n$ i.i.d. samples from…
