A theory of learning data statistics in diffusion models, from easy to hard
Lorenzo Bardone, Claudia Merger, Sebastian Goldt

TL;DR
This paper investigates the learning dynamics of diffusion models, showing that they learn simple pair-wise statistics before higher-order correlations, and introduces a theoretical framework that quantifies the sample complexity of each stage.
Contribution
It introduces the diffusion information exponent, a scalar invariant that characterizes the sample complexity for learning different statistical features in diffusion models.
Findings
Diffusion models learn simple pair-wise statistics at linear sample complexity.
Higher-order statistics like the fourth cumulant require cubic sample complexity.
Shared latent structure can reduce the sample complexity for learning higher-order statistics.
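To make the distinction between pair-wise and higher-order statistics concrete, here is a minimal sketch in Python. It is not the paper's mixed cumulant model or its denoiser. It is a toy illustration under our own assumptions: the helper name `projected_cumulants`, the dimensions, and the choice of a Rademacher (±1) coordinate are all hypothetical. It builds two datasets with the same covariance that differ only in the fourth cumulant along one direction.

```python
import numpy as np

def projected_cumulants(x, u):
    """Return the second and fourth cumulants of the projection x @ u."""
    s = x @ u
    s = s - s.mean()                  # center the projection
    m2 = np.mean(s ** 2)              # second cumulant (variance)
    m4 = np.mean(s ** 4)              # fourth central moment
    return m2, m4 - 3.0 * m2 ** 2     # fourth cumulant: zero for a Gaussian

rng = np.random.default_rng(0)
n, d = 200_000, 8                     # illustrative sizes, not from the paper

u = np.zeros(d)
u[0] = 1.0                            # direction that carries the non-Gaussian signal

# Baseline: isotropic Gaussian inputs.
gauss = rng.standard_normal((n, d))

# Toy non-Gaussian inputs: coordinate 0 is replaced by ±1 with equal probability.
# It still has unit variance, so the covariance matches the Gaussian baseline.
spiked = gauss.copy()
spiked[:, 0] = rng.choice([-1.0, 1.0], size=n)

c2_gauss, c4_gauss = projected_cumulants(gauss, u)
c2_spiked, c4_spiked = projected_cumulants(spiked, u)

print(f"Gaussian: variance={c2_gauss:.3f}, fourth cumulant={c4_gauss:.3f}")
print(f"Spiked:   variance={c2_spiked:.3f}, fourth cumulant={c4_spiked:.3f}")
```

Both datasets have approximately unit variance along `u`, so a learner that only sees pair-wise statistics cannot tell them apart. The difference appears only in the fourth cumulant, which is close to zero for the Gaussian data and close to -2 for the ±1 coordinate.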
Abstract
While diffusion models have emerged as a powerful class of generative models, their learning dynamics remain poorly understood. We address this issue first by empirically showing that standard diffusion models trained on natural images exhibit a distributional simplicity bias, learning simple, pair-wise input statistics before specializing to higher-order correlations. We reproduce this behaviour in simple denoisers trained on a minimal data model, the mixed cumulant model, where we precisely control both pair-wise and higher-order correlations of the inputs. We identify a scalar invariant of the model that governs the sample complexity of learning pair-wise and higher-order correlations that we call the diffusion information exponent, in analogy to related invariants in different learning paradigms. Using this invariant, we prove that the denoiser learns simple, pair-wise statistics of…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Markov Chains and Monte Carlo Methods · Quantum many-body systems
