SPARK: Jailbreaking T2V Models by Synergistically Prompting Auditory and Recontextualized Knowledge
Zonghao Ying, Moyang Chen, Nizhang Li, Zhiqiang Wang, Wenxin Zhang, Quanchen Zou, Zonglei Jing, Aishan Liu, Xianglong Liu

TL;DR
This paper introduces SPARK, a novel prompt-based attack method that exploits cross-modal associations to generate unsafe videos from benign prompts, revealing vulnerabilities in text-to-video models.
Contribution
SPARK leverages a modular prompt design combining scene anchors, auditory triggers, and stylistic modulators to effectively jailbreak T2V models using implicit cues.
Findings
Achieved +23% success rate in attacking 7 T2V models.
Demonstrated that stealthy, benign-looking prompts can induce unsafe content.
Validated effectiveness across multiple commercial models.
Abstract
Jailbreak attacks can circumvent model safety guardrails and reveal critical blind spots. Prior attacks on text-to-video (T2V) models typically add adversarial perturbations to obviously unsafe prompts, which are often easy to detect and defend against. In contrast, we show that benign-looking prompts containing rich, implicit cues can induce T2V models to generate semantically unsafe videos that both violate policy and preserve the original (blocked) intent. To realize this, we propose SPARK, a jailbreak framework that leverages T2V models' cross-modal associative patterns via a modular prompt design. Specifically, our prompts combine three components: neutral scene anchors, which provide the surface-level scene description extracted from the blocked intent to maintain plausibility; latent auditory triggers, textual descriptions of innocuous-sounding audio events (e.g., creaking, muffled…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Advanced Malware Detection Techniques
