PROBE: Diagnosing Residual Concept Capacity in Erased Text-to-Video Diffusion Models
Yiwei Xie, Zheng Zhang, Ping Liu

TL;DR
PROBE introduces a diagnostic protocol to measure residual concept capacity in erased text-to-video diffusion models, revealing that current methods often suppress output but not the underlying concept representations.
Contribution
It presents a new evaluation framework, systematic experiments across architectures and erasure strategies, and uncovers temporal re-emergence as a failure mode in concept erasure.
Findings
Residual concept capacity persists after erasure
Robustness of erasure correlates with intervention depth
Temporal re-emergence causes concepts to resurface across frames
Abstract
Concept erasure techniques for text-to-video (T2V) diffusion models report substantial suppression of sensitive content, yet current evaluation is limited to checking whether the target concept is absent from generated frames, treating output-level suppression as evidence of representational removal. We introduce PROBE, a diagnostic protocol that quantifies the reactivation potential of erased concepts in T2V models. With all model parameters frozen, PROBE optimizes a lightweight pseudo-token embedding through a denoising reconstruction objective combined with a novel latent alignment constraint that anchors recovery to the spatiotemporal structure of the original concept. We make three contributions: (1) a multi-level evaluation framework spanning classifier-based detection, semantic similarity, temporal reactivation analysis, and human validation; (2) systematic experiments…
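The probing idea described in the abstract (freeze the model, optimize only a pseudo-token embedding under a reconstruction loss plus an alignment term) can be illustrated with a deliberately tiny toy. This is a sketch under strong assumptions, not the authors' implementation: the fixed linear maps `W` and `A` stand in for the frozen denoiser and a frozen latent projection, `eps` and `z_ref` stand in for the noise target and a reference concept latent, and the squared-error alignment term weighted by `lam` is a hypothetical choice, since the abstract does not give the form of PROBE's latent alignment constraint.

```python
# Toy illustration only: linear maps replace the real T2V denoiser and latent encoder.
# All names (W, A, eps, z_ref, lam) are hypothetical and not taken from the paper.

def matvec(M, v):
    """Compute M @ v for a list-of-rows matrix M."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_t_vec(M, r):
    """Compute M^T @ r for a list-of-rows matrix M."""
    return [sum(M[i][j] * r[i] for i in range(len(M))) for j in range(len(M[0]))]

def combined_loss(v, W, eps, A, z_ref, lam):
    """Return (loss, denoising residual, alignment residual) for embedding v."""
    r_d = [p - e for p, e in zip(matvec(W, v), eps)]    # reconstruction residual
    r_a = [p - z for p, z in zip(matvec(A, v), z_ref)]  # assumed alignment residual
    loss = sum(x * x for x in r_d) + lam * sum(x * x for x in r_a)
    return loss, r_d, r_a

def optimize_pseudo_token(W, eps, A, z_ref, lam=0.5, lr=0.05, steps=200):
    """Gradient descent on the embedding v only; W and A are never modified."""
    v = [0.0] * len(W[0])  # pseudo-token embedding, the sole trainable quantity
    history = []
    for _ in range(steps):
        loss, r_d, r_a = combined_loss(v, W, eps, A, z_ref, lam)
        history.append(loss)
        g_d = mat_t_vec(W, r_d)  # half the gradient of the reconstruction term
        g_a = mat_t_vec(A, r_a)  # half the gradient of the alignment term
        v = [vi - lr * (2.0 * gd + 2.0 * lam * ga)
             for vi, gd, ga in zip(v, g_d, g_a)]
    return v, history

# Frozen stand-ins for the model components (values are arbitrary).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
A = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
eps = [0.5, -0.2, 0.1, 0.3]
z_ref = [0.4, -0.1]

v_star, history = optimize_pseudo_token(W, eps, A, z_ref)
print(f"loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

In this toy, a loss that falls while `W` and `A` stay untouched mirrors the diagnostic reading in the abstract: if an embedding alone can recover the target, the capacity to produce it still resides in the frozen weights.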
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
