VideoEraser: Concept Erasure in Text-to-Video Diffusion Models
Naen Xu, Jinghuai Zhang, Changjiang Li, Zhi Chen, Chunyi Zhou, Qingming Li, Tianyu Du, Shouling Ji

TL;DR
VideoEraser is a training-free, plug-and-play framework that effectively prevents text-to-video diffusion models from generating videos with undesirable concepts, enhancing safety and control without retraining the models.
Contribution
VideoEraser introduces a two-stage, training-free method for concept erasure in T2V models that outperforms prior approaches in efficacy and robustness.
Findings
Reduces undesirable content generation by 46% on average
Outperforms prior methods in efficacy, fidelity, and robustness
Achieves state-of-the-art suppression of harmful concepts
Abstract
The rapid growth of text-to-video (T2V) diffusion models has raised concerns about privacy, copyright, and safety due to their potential misuse in generating harmful or misleading content. These models are often trained on numerous datasets, including unauthorized personal identities, artistic creations, and harmful materials, which can lead to uncontrolled production and distribution of such content. To address this, we propose VideoEraser, a training-free framework that prevents T2V diffusion models from generating videos with undesirable concepts, even when explicitly prompted with those concepts. Designed as a plug-and-play module, VideoEraser can seamlessly integrate with representative T2V diffusion models via a two-stage process: Selective Prompt Embedding Adjustment (SPEA) and Adversarial-Resilient Noise Guidance (ARNG). We conduct extensive evaluations across four tasks,…
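The abstract names two stages, one that adjusts the prompt embedding and one that guides the noise prediction, but this excerpt does not describe their internals. The following is only a minimal, generic sketch of those two ideas under stated assumptions, not the authors' SPEA or ARNG algorithms: it assumes the concept can be represented as a single direction in embedding space that is projected out, and that noise guidance takes a classifier-free-guidance-like form with an extra term steering away from the concept. All function names and parameters (`remove_concept_direction`, `guided_noise`, `erase_scale`) are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def remove_concept_direction(token_embedding: Vector, concept_embedding: Vector) -> Vector:
    """Subtract the component of token_embedding that lies along concept_embedding.

    Assumption: the concept is modelled as one direction in embedding space.
    """
    norm_sq = dot(concept_embedding, concept_embedding)
    if norm_sq == 0.0:
        # A zero concept vector defines no direction; return the input unchanged.
        return list(token_embedding)
    scale = dot(token_embedding, concept_embedding) / norm_sq
    return [t - scale * c for t, c in zip(token_embedding, concept_embedding)]


def guided_noise(
    eps_uncond: Vector,
    eps_prompt: Vector,
    eps_concept: Vector,
    guidance_scale: float,
    erase_scale: float,
) -> Vector:
    """Combine noise predictions: move toward the prompt, away from the concept.

    Assumption: a classifier-free-guidance-style update with a subtracted
    concept term; the paper's actual guidance rule may differ.
    """
    return [
        u + guidance_scale * (p - u) - erase_scale * (c - u)
        for u, p, c in zip(eps_uncond, eps_prompt, eps_concept)
    ]


# Toy usage with 2-dimensional vectors.
adjusted = remove_concept_direction([3.0, 4.0], [1.0, 0.0])
noise = guided_noise([0.0], [1.0], [2.0], guidance_scale=2.0, erase_scale=0.5)
```

In this toy setting the adjusted embedding is orthogonal to the concept direction, which is what makes the approach training-free: only the inputs and the sampling-time noise combination change, while the model weights stay untouched.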
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Hate Speech and Cyberbullying Detection
