Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
Die Chen, Zhiwen Li, Mingyuan Fan, Cen Chen, Wenmeng Zhou, Yanhao Wang, Yaliang Li

TL;DR
This paper introduces a method that uses growth inhibitors and an adapter to suppress inappropriate concepts in diffusion models without fine-tuning, while preserving image quality.
Contribution
It proposes a concept suppression technique that operates in image space, erasing inappropriate concepts directly without fine-tuning and thereby reducing the risk of catastrophic forgetting.
Findings
Effective suppression of inappropriate concepts in diffusion models.
Minimal impact on image quality and other concepts.
Outperforms existing fine-tuning approaches.
Abstract
Despite their remarkable image generation capabilities, text-to-image diffusion models inadvertently learn inappropriate concepts from vast and unfiltered training data, which leads to various ethical and business risks. Specifically, model-generated images may exhibit not safe for work (NSFW) content and style copyright infringements. The prompts that result in these problems often do not include explicit unsafe words; instead, they contain obscure and associative terms, which are referred to as implicit unsafe prompts. Existing approaches directly fine-tune models under textual guidance to alter the cognition of the diffusion model, thereby erasing inappropriate concepts. This not only requires concept-specific fine-tuning but may also incur catastrophic forgetting. To address these issues, we explore the representation of inappropriate concepts in the image space and guide them…
Taxonomy
Topics: Cognitive Science and Education Research
Methods: Softmax · Attention Is All You Need · Diffusion
