Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training

Tony Bonnaire; Raphaël Urfin; Giulio Biroli; Marc Mézard

arXiv:2505.17638 · cs.LG · October 29, 2025

TL;DR

This paper investigates how diffusion models avoid memorization through implicit dynamical regularization, revealing distinct training timescales for generalization and memorization, supported by experiments and theoretical analysis.

Contribution

It shows that training dynamics set two separate timescales, one for generalization and one for memorization, and that the window of training times between them, in which the model generalizes without memorizing, grows with the training set size $n$.

Findings

01

Memorization time $\tau_{mem}$ increases linearly with dataset size $n$.

02

Generalization time $\tau_{gen}$ remains constant regardless of $n$.

03

Implicit dynamical regularization prevents memorization in overparameterized models.
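
The first two findings can be illustrated with a small sketch. It assumes, purely for illustration, that $\tau_{gen}$ is a fixed constant and that $\tau_{mem}$ equals a proportionality constant times $n$. The function names, the default values `tau_gen=1.0` and `mem_rate=0.5`, and the time units are hypothetical and are not taken from the paper.

```python
from typing import Optional, Tuple


def generalization_window(
    n: int, tau_gen: float = 1.0, mem_rate: float = 0.5
) -> Optional[Tuple[float, float]]:
    """Return (tau_gen, tau_mem) if a generalization window exists, else None.

    Assumptions (illustrative only, not values from the paper):
      - tau_gen is constant in n
      - tau_mem = mem_rate * n, i.e. linear in the training set size
    """
    tau_mem = mem_rate * n
    if tau_mem <= tau_gen:
        # Memorization would set in before good samples appear: no window.
        return None
    return (tau_gen, tau_mem)


def window_width(n: int, tau_gen: float = 1.0, mem_rate: float = 0.5) -> float:
    """Length of the training-time interval between tau_gen and tau_mem."""
    window = generalization_window(n, tau_gen, mem_rate)
    return 0.0 if window is None else window[1] - window[0]


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, generalization_window(n), window_width(n))
```

Under these assumptions the window is empty for very small $n$ and its width grows linearly once $\tau_{mem}$ exceeds $\tau_{gen}$, which matches the qualitative picture of a growing range of safe stopping times.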

Abstract

Diffusion models have achieved remarkable success across a wide range of generative tasks. A key challenge is understanding the mechanisms that prevent their memorization of training data and allow generalization. In this work, we investigate the role of the training dynamics in the transition from generalization to memorization. Through extensive experiments and theoretical analysis, we identify two distinct timescales: an early time $\tau_{gen}$ at which models begin to generate high-quality samples, and a later time $\tau_{mem}$ beyond which memorization emerges. Crucially, we find that $\tau_{mem}$ increases linearly with the training set size $n$, while $\tau_{gen}$ remains constant. This creates a growing window of training times with $n$ where models generalize effectively, despite showing strong memorization if training continues beyond it. It is only…

Videos

Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training · SlidesLive

Taxonomy

Methods: Concatenated Skip Connection · Max Pooling · Convolution · U-Net · Sparse Evolutionary Training