T2VUnlearning: A Concept Erasing Method for Text-to-Video Diffusion Models

Xiaoyu Ye; Songjie Cheng; Yongtao Wang; Yajiao Xiong; Yishen Li

arXiv:2505.17550 · cs.CV · September 30, 2025

TL;DR

This paper introduces T2VUnlearning, a novel method for erasing specific concepts from text-to-video diffusion models to prevent misuse, while maintaining overall video generation quality.

Contribution

The paper proposes a new concept erasing technique combining velocity prediction fine-tuning, prompt augmentation, and regularization to effectively remove targeted concepts from T2V models.

Findings

1. Effective concept erasure demonstrated in experiments
2. Model preserves non-target concept generation
3. Outperforms existing concept erasing methods

Abstract

Recent advances in text-to-video (T2V) diffusion models have significantly enhanced the quality of generated videos. However, their capability to produce explicit or harmful content introduces new challenges related to misuse and potential rights violations. To address this newly emerging threat, we propose unlearning-based concept erasing as a solution. First, we adopt negatively-guided velocity prediction fine-tuning and enhance it with prompt augmentation to ensure robustness against prompts refined by large language models (LLMs). Second, to achieve precise unlearning, we incorporate mask-based localization regularization and concept preservation regularization to preserve the model's ability to generate non-target concepts. Extensive experiments demonstrate that our method effectively erases a specific concept while preserving the model's generation capability for all other…
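The objective described in the abstract can be illustrated with a small sketch. It assumes an ESD-style negative-guidance target applied to velocity predictions, a localization penalty on adapter output outside a concept mask, and a preservation term that matches a frozen model on non-target prompts. The function names, the guidance scale `eta`, and the weights `w_loc` and `w_keep` are hypothetical; none of them come from the paper or its repository, and the paper's exact formulation may differ. Prompt augmentation is not shown.

```python
# Sketch only: names, eta, and loss weights are illustrative assumptions,
# not values or APIs taken from the paper or its repository.

def negative_target(v_uncond, v_concept, eta=1.0):
    """Assumed ESD-style target: step away from the concept-conditioned velocity."""
    return [u - eta * (c - u) for u, c in zip(v_uncond, v_concept)]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def localization_penalty(adapter_out, mask):
    """Penalize adapter output where mask is 0 (outside the concept region)."""
    return sum(((1 - m) * o) ** 2 for o, m in zip(adapter_out, mask)) / len(mask)

def unlearning_loss(v_student, v_uncond, v_concept, adapter_out, mask,
                    v_student_keep, v_frozen_keep,
                    eta=1.0, w_loc=0.1, w_keep=1.0):
    """Erasure term plus the two regularizers, combined as a weighted sum."""
    erase = mse(v_student, negative_target(v_uncond, v_concept, eta))
    loc = localization_penalty(adapter_out, mask)
    keep = mse(v_student_keep, v_frozen_keep)  # non-target prompts
    return erase + w_loc * loc + w_keep * keep
```

In this sketch, setting `eta=0` reduces the target to the unconditional velocity, and larger values push the fine-tuned prediction further from the concept direction. Vectors are plain Python lists to keep the example dependency-free; a real implementation would operate on video latent tensors.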

Peer Reviews

Decision · ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 6 · Confidence 2

Strengths

1. Problem significance. The systematic transfer of concept unlearning from T2I to T2V addresses pressing safety and compliance needs in generative video.
2. Engineering practicality. Plug-and-play adapters are applicable to multiple public T2V backbones, and the approach appears deployment-friendly.
3. Relatively comprehensive evidence. It covers three sensitive concept families, multiple models and metrics, plus ablations and a user study.
4. Clear localisation idea. It uses QK interactions wi

Weaknesses

1. The paper criticises the reliance on LLM-refined prompts for inference in prior SOTA for undermining defences, yet later adopts LLM-based prompt augmentation in training. Please clarify the distinction in terms of stage, objective and risk, and articulate the novel contribution beyond a combination of known components.
2. Nudity evaluation largely depends on a single detector (e.g. NudeNet), so bias/misclassification may influence conclusions. There is a lack of multidetector agreement or sma

Reviewer 02 · Rating 4 · Confidence 4

Strengths

Comprehensive evaluation: multiple model families, diverse prompt distributions (including human-written), and a mix of automatic and human studies. The SafeSora and VBench analyses are appropriate, and the face-erasure study is a challenging stress test.

Weaknesses

1. VBench Object Class and Subject Consistency are valuable, but broader “video quality” and “text-video alignment” metrics (e.g., aesthetic/FLA, motion fidelity, temporal consistency beyond a single metric) could reveal subtle degradations post-unlearning.
2. The augmentation is crafted to mirror T2V training prompts; robustness to adversarial or diverse paraphrases outside the LLM’s style remains uncertain. Reporting performance under adversarially optimized prompts (beyond long/refined ones)

Reviewer 03 · Rating 2 · Confidence 5

Strengths

- Addresses an emerging and relevant topic: safety and concept erasure in T2V generation.
- Attempts to explore an unlearning perspective beyond simple prompt filtering.
- Provides qualitative examples and a limited user study.

Weaknesses

1. Unreasonable setting and problematic methodology. The key distinction between T2V and T2I lies in temporal consistency. Video-level concept unlearning could naturally exploit temporal priors such as keyframe guidance or inversion-free conditioning. In contrast, the proposed negative-guidance approach acts purely on the noise/velocity level, which is not a principled choice for video data. This design inherits known instability and over-suppression issues from prior image-level works, leading

Reviewer 04 · Rating 2 · Confidence 4

Strengths

- The authors have adopted various T2I concept erasure techniques into T2V models and show that it works.
- They show promising results on various tasks including nudity, objects and faces.
- The paper is generally well written and I appreciate that the authors have openly cited works from which they have borrowed particular ideas.

Weaknesses

- The authors seem to be making a distinction between unlearning methods for T2I vs T2V diffusion models. Most methods proposed for T2I models could also be applied directly to T2V models. Thus I think claims such as "we are the first to propose an unlearning-based concept erasing method for T2V models" need to be revisited, especially given that methods such as SAFREE have already shown generalizability to T2V models.
- The proposed method is an adaptation of ESD and Receler for T2V models.
- The

Code & Models

Repositories

vdigpku/t2vunlearning · jax · Official

Taxonomy

Topics: Video Analysis and Summarization · Multimedia Communication and Technology

Methods: ADaptive gradient method with the OPTimal convergence rate · Diffusion