Two Frames Matter: A Temporal Attack for Text-to-Video Model Jailbreaking
Moyang Chen, Zonghao Ying, Wenzhuo Xu, Quancheng Zou, Deyue Zhang, Dongdong Yang, Xiangzheng Zhang

TL;DR
This paper uncovers a temporal vulnerability in text-to-video models: prompts that specify only sparse boundary conditions (e.g., start and end frames) can lead the model to fill in harmful intermediate frames. It proposes a method that exploits this weakness and highlights the need for temporally aware safety measures.
Contribution
The paper identifies a novel temporal trajectory infilling vulnerability in T2V models and introduces TFM, a framework that enhances jailbreak success by exploiting this weakness.
Findings
TFM increases attack success rate by up to 12% on commercial T2V models.
Fragmented prompts that specify only boundary conditions (e.g., start and end frames) can cause models to generate harmful intermediate frames.
Existing safety measures overlook temporal completion vulnerabilities.
Abstract
Recent text-to-video (T2V) models can synthesize complex videos from lightweight natural language prompts, raising urgent concerns about safety alignment in the event of misuse in the real world. Prior jailbreak attacks typically rewrite unsafe prompts into paraphrases that evade content filters while preserving meaning. Yet, these approaches often still retain explicit sensitive cues in the input text and therefore overlook a more profound, video-specific weakness. In this paper, we identify a temporal trajectory infilling vulnerability of T2V systems under fragmented prompts: when the prompt specifies only sparse boundary conditions (e.g., start and end frames) and leaves the intermediate evolution underspecified, the model may autonomously reconstruct a plausible trajectory that includes harmful intermediate frames, despite the prompt appearing benign to input or output side…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing
