Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Xiao Li, Zekai Zhang, Xiang Li, Siyi Chen, Zhihui Zhu, Peng Wang, Qing Qu

TL;DR
This paper investigates unimodal representation dynamics in diffusion models, where feature quality peaks at an intermediate noise level, and uses theoretical analysis and empirical evidence to relate them to how well the model captures the data distribution and how well it generalizes.
Contribution
It provides a theoretical explanation for unimodal dynamics in diffusion models based on low-dimensional data structure and links these dynamics to model generalization and memorization.
Findings
Unimodal dynamics emerge when the diffusion model successfully captures the underlying data distribution.
The presence of unimodal dynamics indicates better generalization and more novel generated images.
As the model memorizes its training data, the unimodal curve gives way to a monotonic decrease.
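To make the distinction in these findings concrete, here is a minimal sketch of how one might label a curve of representation quality measured across increasing noise levels. It is not the authors' procedure: the function name `curve_shape`, the `tol` parameter, and the example scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def curve_shape(scores: Sequence[float], tol: float = 0.0) -> str:
    """Label a quality curve indexed by increasing noise level.

    Hypothetical helper, not taken from the paper.
    Returns "monotonic_decrease" if the curve never rises (within tol),
    "unimodal" if it rises to a single interior peak and then falls,
    and "other" for anything else.
    """
    if len(scores) < 3:
        raise ValueError("need at least three noise levels")

    # Differences between consecutive noise levels.
    diffs = [b - a for a, b in zip(scores, scores[1:])]

    # Never rises: the pattern the findings associate with memorization.
    if all(d <= tol for d in diffs):
        return "monotonic_decrease"

    # Index of the best score; unimodality needs it strictly inside the range.
    peak = max(range(len(scores)), key=scores.__getitem__)
    rising = all(d >= -tol for d in diffs[:peak])
    falling = all(d <= tol for d in diffs[peak:])
    if 0 < peak < len(scores) - 1 and rising and falling:
        return "unimodal"
    return "other"


# Made-up scores for illustration only; these are not results from the paper.
generalizing_curve = [0.6, 0.8, 0.9, 0.7, 0.4]
memorizing_curve = [0.9, 0.8, 0.6, 0.3]
print(curve_shape(generalizing_curve))
print(curve_shape(memorizing_curve))
```

In practice the scores would come from something like probing accuracy on features extracted at each noise level, and a small positive `tol` can absorb measurement noise between adjacent levels.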
Abstract
Diffusion models, though originally designed for generative tasks, have demonstrated impressive self-supervised representation learning capabilities. A particularly intriguing phenomenon in these models is the emergence of unimodal representation dynamics, where the quality of learned features peaks at an intermediate noise level. In this work, we conduct a comprehensive theoretical and empirical investigation of this phenomenon. Leveraging the inherent low-dimensionality structure of image data, we theoretically demonstrate that the unimodal dynamic emerges when the diffusion model successfully captures the underlying data distribution. The unimodality arises from an interplay between denoising strength and class confidence across noise scales. Empirically, we further show that, in classification tasks, the presence of unimodal dynamics reliably reflects the generalization of the…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Face recognition and analysis
Methods: Diffusion
