You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Models
Kairan Zhao, Eleni Triantafillou, Peter Triantafillou

TL;DR
This paper introduces GUARD, a novel method to reduce memorization in text-to-image diffusion models by dynamically attenuating attention during inference, effectively preventing reproduction of training data without compromising image quality.
Contribution
The paper presents a new framework and a specific attention attenuation technique that significantly mitigates memorization in diffusion models while maintaining high image quality.
Findings
GUARD effectively reduces memorization across multiple architectures.
The method maintains or improves image quality compared to existing approaches.
It demonstrates robustness against both verbatim and template memorization.
Abstract
Generative models have been shown to "memorize" certain training data, leading to verbatim or near-verbatim reproduction of training images, which may raise privacy concerns or constitute copyright infringement. We introduce Guidance Using Attractive-Repulsive Dynamics (GUARD), a novel framework for memorization mitigation in text-to-image diffusion models. GUARD adjusts the image denoising process to guide the generation away from an original training image and towards one that is distinct from training data while remaining aligned with the prompt, guarding against reproducing training data without hurting image generation quality. We propose a concrete instantiation of this framework, where the positive target that we steer towards is given by a novel method for (cross) attention attenuation based on (i) a novel statistical mechanism that automatically identifies the prompt positions where cross attention…
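To make the two ideas in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract is truncated before it specifies how positions are selected or how the guidance terms are combined. The function names, the `scale` and `weight` parameters, the row renormalization, and the classifier-free-guidance-style combination are all assumptions made for illustration.

```python
import numpy as np


def attenuate_cross_attention(attn, positions, scale=0.1):
    """Down-weight cross attention at selected prompt positions.

    attn: array of shape (num_queries, num_tokens); each row sums to 1.
    positions: prompt token indices to attenuate (hypothetical input; the
        paper selects these with a statistical mechanism not shown here).
    scale: attenuation factor in (0, 1] (assumed hyperparameter).
    """
    out = attn.copy()
    out[:, positions] *= scale
    # Renormalize so each row is again a distribution (an assumption).
    out /= out.sum(axis=1, keepdims=True)
    return out


def attractive_repulsive_guidance(eps_pos, eps_neg, weight=2.0):
    """Steer towards a positive noise prediction and away from a negative one.

    Uses the familiar classifier-free-guidance form as a stand-in; the
    paper's exact update rule may differ.
    """
    return eps_pos + weight * (eps_pos - eps_neg)


# Toy example: two image queries attending over three prompt tokens.
attn = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.1, 0.8]])
attenuated = attenuate_cross_attention(attn, positions=[0], scale=0.1)

# Toy noise predictions for the positive (attenuated) and negative branches.
eps_pos = np.ones((2, 2))
eps_neg = np.zeros((2, 2))
guided = attractive_repulsive_guidance(eps_pos, eps_neg, weight=2.0)
```

In this sketch the attenuated attention would feed the "positive" branch of the denoiser, while the unmodified, memorization-prone prediction plays the role of the "negative" branch that guidance pushes away from.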
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis
