Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data

Saptarshi Chakraborty; Quentin Berthet; Peter L. Bartlett

arXiv:2603.03700·stat.ML·April 24, 2026

TL;DR

This paper derives finite-sample error bounds for score-based diffusion models, showing that they adapt to the data's intrinsic low-dimensional structure and achieve faster convergence rates than bounds that scale with the ambient dimension.

Contribution

The paper develops Wasserstein-$p$ error bounds that depend on the data's intrinsic dimension rather than its ambient dimension, under mild regularity conditions and a finite-moment assumption on the data distribution.

Findings

01

Error bounds scale as $n^{-1/d^*_{p,q}(μ)}$ with sample size $n$.

02

Diffusion models adapt to the intrinsic geometry of the data, mitigating the curse of dimensionality.

03

The $(p,q)$-Wasserstein dimension extends classical notions to distributions with unbounded support.
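
To make the practical meaning of a rate of the form $n^{-1/d}$ concrete, the following minimal sketch compares the leading-order error for a small and a large value of $d$. It is an illustration only: constants and logarithmic factors are ignored, the function names are hypothetical, and the two dimensions (6 and 3072, the latter being the pixel count of a 32×32 RGB image) are assumed values rather than figures from the paper.

```python
import math


def rate(n: int, dim: float) -> float:
    """Leading-order error n^(-1/dim), ignoring constants and log factors."""
    return n ** (-1.0 / dim)


def samples_needed(eps: float, dim: float) -> float:
    """Sample size n at which n^(-1/dim) equals eps, i.e. n = eps^(-dim)."""
    return eps ** (-dim)


n = 10**6
# Hypothetical dimensions: 6 stands in for an intrinsic dimension,
# 3072 for the ambient dimension of a 32x32 RGB image.
for dim in (6, 3072):
    print(f"dim={dim:5d}  n={n}  rate={rate(n, dim):.4f}")

# Sample size required to reach error 0.1 when the rate is governed by dim=6.
print(f"n for eps=0.1, dim=6: {samples_needed(0.1, 6):.3g}")
```

With a million samples, the rate governed by the smaller dimension reaches 0.1, while the rate governed by the ambient dimension stays close to 1. This is the sense in which a bound depending on an intrinsic dimension is far less pessimistic.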

Abstract

Despite the remarkable empirical success of score-based diffusion models, their statistical guarantees remain underdeveloped. Existing analyses often provide pessimistic convergence rates that do not reflect the intrinsic low-dimensional structure common in real data, such as that arising in natural images. In this work, we study the statistical convergence of score-based diffusion models for learning an unknown distribution $μ$ from finitely many samples. Under mild regularity conditions on the forward diffusion process and the data distribution, we derive finite-sample error bounds on the learned generative distribution, measured in the Wasserstein-$p$ distance. Unlike prior results, our guarantees hold for all $p \geq 1$ and require only a finite-moment assumption on $μ$, without compact-support, manifold, or smooth-density conditions. Specifically, given $n$ i.i.d. samples from…
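
For reference, the Wasserstein-$p$ distance in which the error is measured has the following standard definition. The notation is an assumption made here for illustration and is not taken from the truncated abstract: $\hat{μ}$ denotes the learned distribution, $\Gamma(μ, \hat{μ})$ the set of couplings with marginals $μ$ and $\hat{μ}$, and $\|\cdot\|$ the Euclidean norm.

$$
W_p(μ, \hat{μ}) = \left( \inf_{γ \in \Gamma(μ, \hat{μ})} \int \|x - y\|^p \, \mathrm{d}γ(x, y) \right)^{1/p}, \qquad p \geq 1.
$$

For $W_p(μ, \hat{μ})$ to be finite, both distributions need finite $p$-th moments, which is why some moment condition on $μ$ is unavoidable for bounds of this kind.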
