Gaussian Universality for Diffusion Models
Reza Ghane, Anthony Bao, Danil Akhtiamov, and Babak Hassibi

TL;DR
This paper demonstrates that for data generated by diffusion models, the test error of linear classifiers depends only on the first two moments of the data, establishing a Gaussian universality principle for diffusion-generated data.
Contribution
It proves a Gaussian universality result for diffusion model data, showing test error depends solely on means and covariances, and highlights challenges in extending existing universality proofs.
Findings
Test error of linear models matches between diffusion data and Gaussian mixtures.
Lipschitz scalar functions of diffusion samples are close to their expectations with high probability.
Current universality proofs do not extend to diffusion data due to covariance singularities.
Abstract
We investigate Gaussian Universality for data distributions generated via diffusion models. By Gaussian Universality we mean that the test error of a generalized linear model trained for a classification task on the diffusion data matches the test error of the same model trained on the Gaussian Mixture with matching means and covariances per class. In other words, the test error depends only on the first and second order statistics of the diffusion-generated data in the linear setting. As a corollary, the analysis of the test error for linear classifiers on diffusion-generated data can be reduced to the analysis on Gaussian data. Analysing the performance of models trained on synthetic data is a pertinent problem due to the surge of methods such as \cite{sehwag2024stretchingdollardiffusiontraining}. Moreover, we show that, for any Lipschitz scalar function,…
Taxonomy
Topics: Bayesian Methods and Mixture Models
Methods: Diffusion
