Zero-Residual Concept Erasure via Progressive Alignment in Text-to-Image Model
Hongxu Chen, Zhen Wang, Taoran Mei, Lin Li, Bowei Zhu, Runshi Li, Long Chen

TL;DR
This paper introduces ErasePro, a closed-form method for concept erasure in text-to-image models that targets more complete removal of target concepts while preserving image quality through progressive, layer-wise updates.
Contribution
ErasePro employs a zero-residual constraint and a layer-wise update strategy to improve concept erasure completeness and preserve generative quality in text-to-image models.
Findings
ErasePro is effective at erasing a range of concepts, such as nudity and art styles.
It achieves more complete erasure than previous methods.
It maintains high image quality after erasure.
Abstract
Concept erasure, which aims to prevent pretrained text-to-image models from generating content associated with semantically harmful concepts (i.e., target concepts), is receiving increasing attention. State-of-the-art methods formulate this task as an optimization problem: they align all target concepts with semantically harmless anchor concepts, and apply closed-form solutions to update the model accordingly. While these closed-form methods are efficient, we argue that existing methods have two overlooked limitations: 1) They often result in incomplete erasure due to "non-zero alignment residual", especially when text prompts are relatively complex. 2) They may suffer from generation quality degradation because they always concentrate parameter updates in a few deep layers. To address these issues, we propose a novel closed-form method ErasePro: it is designed for more complete concept erasure and…
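To make the "non-zero alignment residual" idea concrete, here is a minimal sketch of a generic ridge-regularized closed-form edit, in the spirit of the earlier closed-form methods the abstract refers to. It is not the ErasePro update, which the truncated abstract does not specify. The objective, the names `closed_form_align` and `alignment_residual`, the regularization weight `lam`, and the random toy embeddings are all assumptions made for illustration.

```python
import numpy as np

def closed_form_align(W, targets, anchors, lam=0.1):
    """Generic ridge-style closed-form edit (illustrative assumption, not ErasePro).

    Minimizes  sum_i ||W_new c_i - W a_i||^2 + lam * ||W_new - W||_F^2,
    where the columns of `targets` are c_i and the columns of `anchors` are a_i.
    Setting the gradient to zero gives
        W_new (C C^T + lam I) = W A C^T + lam W.
    """
    d_in = W.shape[1]
    lhs = W @ anchors @ targets.T + lam * W          # shape (d_out, d_in)
    rhs = targets @ targets.T + lam * np.eye(d_in)   # shape (d_in, d_in), symmetric
    # Solve W_new @ rhs = lhs via the transposed system rhs @ W_new^T = lhs^T.
    return np.linalg.solve(rhs, lhs.T).T

def alignment_residual(W_new, W, targets, anchors):
    """Frobenius norm of W_new C - W A: how far targets remain from their anchors."""
    return np.linalg.norm(W_new @ targets - W @ anchors)

# Toy data: a hypothetical 4x8 projection and three target/anchor embedding pairs.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
targets = rng.normal(size=(8, 3))
anchors = rng.normal(size=(8, 3))

for lam in (1.0, 1e-6):
    W_new = closed_form_align(W, targets, anchors, lam=lam)
    print(lam, alignment_residual(W_new, W, targets, anchors))
```

In this toy setting the preservation term keeps the edited weights close to the originals, so the target outputs do not land exactly on the anchor outputs unless `lam` is driven toward zero. That gap is one plausible reading of the residual the abstract describes.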
