Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient
Yongliang Wu, Shiji Zhou, Mingzhuo Yang, Lianzhe Wang, Heng Chang, Wenbo Zhu, Xinting Hu, Xiao Zhou, Xu Yang

TL;DR
This paper introduces DoCo, a novel framework for unlearning sensitive concepts in diffusion models by aligning concept domains and preserving utility through gradient surgery, effectively removing targeted concepts with minimal utility loss.
Contribution
The paper proposes DoCo, a new concept domain correction method combined with gradient surgery to improve unlearning of sensitive concepts in diffusion models, especially for out-of-distribution prompts.
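The gradient surgery idea can be illustrated with a small sketch. The summary does not give the paper's exact update rule, so the code below assumes a PCGrad-style projection: when the unlearning gradient points against the utility-preserving (retain) gradient, the conflicting component is removed. The function names `dot` and `preserve_utility` and the plain-list gradient representation are hypothetical, chosen only for illustration.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def preserve_utility(g_unlearn: Vector, g_retain: Vector, eps: float = 1e-12) -> Vector:
    """Return an unlearning gradient that no longer opposes the retain gradient.

    Assumption: a PCGrad-style projection, not necessarily the rule used in DoCo.
    """
    if len(g_unlearn) != len(g_retain):
        raise ValueError("gradients must have the same length")

    overlap = dot(g_unlearn, g_retain)
    norm_sq = dot(g_retain, g_retain)

    # No conflict (non-negative inner product) or a vanishing retain gradient:
    # keep the unlearning gradient unchanged.
    if overlap >= 0.0 or norm_sq < eps:
        return list(g_unlearn)

    # Conflict: subtract the projection of g_unlearn onto g_retain, which
    # leaves a vector orthogonal to g_retain.
    scale = overlap / norm_sq
    return [u - scale * r for u, r in zip(g_unlearn, g_retain)]


if __name__ == "__main__":
    conflicting = preserve_utility([1.0, -1.0], [0.0, 1.0])
    print(conflicting)
    print(dot(conflicting, [0.0, 1.0]))
```

In a real training loop the two vectors would be the flattened parameter gradients of the unlearning loss and the retain loss; the projected result replaces the raw unlearning gradient before the optimizer step.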
Findings
Effective unlearning of sensitive concepts across various styles and prompts.
Minimal impact on model utility after unlearning.
Outperforms previous methods in out-of-distribution scenarios.
Abstract
Text-to-image diffusion models have achieved remarkable success in generating photorealistic images. However, the inclusion of sensitive information during pre-training poses significant risks. Machine Unlearning (MU) offers a promising solution to eliminate sensitive concepts from these models. Despite its potential, existing MU methods face two main challenges: 1) limited generalization, where concept erasure is effective only within the unlearned set, failing to prevent sensitive concept generation from out-of-set prompts; and 2) utility degradation, where removing target concepts significantly impacts the model's overall performance. To address these issues, we propose a novel concept domain correction framework named DoCo (Domain Correction). By aligning the output domains of sensitive and anchor concepts through adversarial training, our approach ensures…
Taxonomy
Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Neural Networks and Applications
Methods: Sparse Evolutionary Training · Diffusion
