Generalization of Diffusion Models Arises with a Balanced Representation Space
Zekai Zhang, Xiao Li, Xiang Li, Lianghe Shi, Meng Wu, Molei Tao, Qing Qu

TL;DR
This paper explores how diffusion models generalize versus memorize by analyzing their representation structures, revealing that balanced representations lead to better generalization and proposing methods for detection and control.
Contribution
It provides a theoretical analysis linking representation structures to memorization and generalization in diffusion models, and introduces practical techniques for detection and editing.
Findings
Memorization involves storing raw training data in weights, leading to localized representations.
Generalization involves capturing local data statistics, resulting in balanced representations.
Representation structures observed in real-world diffusion models align with theoretical predictions.
Abstract
Diffusion models excel at generating high-quality, diverse samples, yet they risk memorizing training data when overfit to the training objective. We analyze the distinctions between memorization and generalization in diffusion models through the lens of representation learning. By investigating a two-layer ReLU denoising autoencoder (DAE), we prove that (i) memorization corresponds to the model storing raw training samples in the learned weights for encoding and decoding, yielding localized spiky representations, whereas (ii) generalization arises when the model captures local data statistics, producing balanced representations. Furthermore, we validate these theoretical findings on real-world unconditional and text-to-image diffusion models, demonstrating that the same representation structures emerge in deep generative models with significant practical implications. Building on these…
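The contrast between localized (spiky) and balanced representations can be illustrated with a toy example. The following is a minimal sketch, not the paper's construction: the encoder weights are hand-set rather than trained, and the helper names (`relu_encode`, `spikiness`), the activation threshold of 0.9, and the max-over-sum spikiness score are all assumptions made for illustration. It encodes the same unit-norm samples with (a) one hidden ReLU unit per training sample, mimicking weights that store raw training data, and (b) hidden units aligned with shared coordinate directions, then compares how concentrated the activations are.

```python
import math
import random


def relu_encode(weights, threshold, x):
    """Hidden representation h_j = ReLU(<w_j, x> - threshold) for each unit j."""
    return [
        max(0.0, sum(w_k * x_k for w_k, x_k in zip(w, x)) - threshold)
        for w in weights
    ]


def spikiness(h):
    """Share of total activation carried by the largest unit (1.0 means one-hot)."""
    total = sum(h)
    return max(h) / total if total > 0 else 0.0


def unit_vector(rng, dim):
    """Draw a random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def mean_spikiness(weights, threshold, samples):
    """Average spikiness of the hidden representation over a set of inputs."""
    scores = [spikiness(relu_encode(weights, threshold, x)) for x in samples]
    return sum(scores) / len(scores)


rng = random.Random(0)
dim, n_train = 16, 12
train = [unit_vector(rng, dim) for _ in range(n_train)]

# Hypothetical "memorizing" encoder: each hidden unit stores one raw training
# sample, and a high threshold means only a near-identical input activates it.
memorizing_weights = train

# Hypothetical "balanced" encoder: hidden units follow shared directions
# (+/- coordinate axes), so every input spreads its activation over many units.
balanced_weights = [
    [sign * (1.0 if k == j else 0.0) for k in range(dim)]
    for j in range(dim)
    for sign in (1.0, -1.0)
]

mem_score = mean_spikiness(memorizing_weights, 0.9, train)
bal_score = mean_spikiness(balanced_weights, 0.0, train)
print(f"memorizing encoder spikiness: {mem_score:.3f}")
print(f"balanced encoder spikiness:   {bal_score:.3f}")
```

In this toy setting the sample-storing encoder concentrates almost all activation on a single unit, while the direction-based encoder spreads it across many units. A score of this kind is only one plausible way to quantify the difference; the paper's own detection statistic may differ.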
Peer Reviews
Decision: ICLR 2026 Poster
- The paper presents a valuable contribution by proposing a more fundamental, representation-centric framework to explain the behaviors of memorization and generalization in diffusion models.
- The quality of the theoretical analysis is high, and the findings are shown to be consistent with observations in real-world models, as validated through two practical applications.
- The memorization detection method demonstrates high performance and broad applicability across different models and datase
- The theoretical framework is built upon a two-layer DAE. While empirical results suggest the conclusions hold more broadly, I do not think it is very clear why the findings based on the linear projection dimension $p$ vs. data size $n$ ($p \geq n$ or $p \ll n$) should transfer to deep, multi-layered UNet / DiT.
- The image editing experiments are limited in scope and somewhat unclear in their details.
- The representation steering method is only demonstrated on Stable Diffusion 1.4, which is
This paper benefits from both theoretical analysis and further empirical impact. In particular, through the theoretical analysis, the authors show that properties of the data representations (spikiness or balance) can serve as signals of memorisation or generalisation. Such signals can be used for memorisation detection, achieving higher accuracy than previous metrics while also being prompt-free.
I have the following concerns or questions, which may need the authors' clarification. Please also correct me if I am wrong.
1. In case 1 of over-parameterisation, the authors found that with the optimal solution, the representations of a **single training sample** exhibit spikiness (in line 269). I am wondering whether, if we input a different sample (not in the training data) to such a neural network, the learned representation becomes a zero vector.
2. In case 2 of under-parameterisation, the au
Overall, I very much enjoyed reading this paper. Here are some specific points of strength:
- The paper is clearly organized, and concepts are well explained and presented.
- I very much like that the paper is centered around rigorous theory, yet makes it approachable in presentation, and demonstrates the strengths and downstream applications of the theory through practical experiments with real datasets and models.
- This is a paper that makes a strong contribution toward understanding diffusion model
- There are a few things about Definition 3.1 that seem odd to me. What 3.1 says is that the cluster means must be separated by at least some angle related to beta. This seems like a very specific type of clustering. It excludes, for example, clustering in the same direction (imagine a multi-modal Gaussian with a sequence of modes in one dimension). Perhaps this is a common assumption I am not aware of. Could the authors defend this choice? Is it necessary for the theorems?
- It seems to me that d
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Advanced Neuroimaging Techniques and Applications
