Broken Memories: Detecting and Mitigating Memorization in Diffusion Models with Degraded Generations
Yuanmin Huang, Mi Zhang, Chen Chen, Feifei Li, Geng Hong, Xiaoyu You, Min Yang

TL;DR
This paper introduces a novel stability-based method to detect and mitigate memorization in diffusion models, effectively reducing privacy risks while maintaining image quality.
Contribution
It presents the first empirical stability analysis linking memorization to numerical instability and proposes an on-the-fly mitigation framework.
Findings
Achieves over 0.999 AUC in memorization detection on Stable Diffusion 1.4.
Reduces the memorization rate to 0.0% after mitigation.
Adds negligible overhead of approximately 0.01 seconds per image.
Abstract
While diffusion models excel at generating high-quality images, their tendency to memorize training data poses significant privacy and copyright risks. In this work, we identify for the first time that memorization induces internal numerical instability, often manifesting as visually "broken" artifacts. Inspired by stability analysis in numerical methods, we introduce empirical stability regions based on latent update norms to quantitatively characterize stable behavior during generation. Leveraging this, we propose a principled, on-the-fly framework for step-wise detection and adaptive mitigation. Our approach suppresses memorization without altering prompts or guidance, thereby preserving semantic fidelity and image quality. Extensive experiments on Stable Diffusion 1.4 demonstrate that our method achieves an AUC detection performance and a memorization rate after…
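The idea of step-wise detection with latent update norms can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the stability region precisely, so the sketch assumes a simple per-step upper bound taken as a quantile of update norms over reference trajectories presumed not to be memorized. The function names `update_norms`, `fit_stability_region`, and `first_unstable_step`, and the `quantile` parameter, are hypothetical and chosen only for illustration.

```python
import numpy as np

def update_norms(latents):
    """L2 norm of each latent update z_{t+1} - z_t along one sampling trajectory.

    latents: array of shape (T + 1, ...), one latent per sampling step.
    Returns an array of T norms.
    """
    latents = np.asarray(latents, dtype=np.float64)
    diffs = latents[1:] - latents[:-1]
    # Flatten every update so the norm is taken over all latent dimensions.
    return np.linalg.norm(diffs.reshape(diffs.shape[0], -1), axis=1)

def fit_stability_region(reference_norms, quantile=0.99):
    """Assumed form of an empirical stability region: a per-step upper bound.

    reference_norms: array of shape (N, T) with update norms from N reference
    trajectories. The quantile rule is an assumption, not the paper's definition.
    """
    return np.quantile(np.asarray(reference_norms, dtype=np.float64), quantile, axis=0)

def first_unstable_step(norms, upper):
    """Index of the first step whose update norm leaves the region, else None."""
    outside = np.flatnonzero(np.asarray(norms) > np.asarray(upper))
    return int(outside[0]) if outside.size else None
```

In an on-the-fly setting, a check like `first_unstable_step` would run inside the sampling loop on the norms seen so far, so that a mitigation step could be triggered as soon as a step is flagged. How the paper adapts the generation after a flag is not described in the abstract and is left out here.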
