VII: Visual Instruction Injection for Jailbreaking Image-to-Video Generation Models

Bowen Zheng; Yongli Xiang; Ziming Hong; Zerong Lin; Chaojian Yu; Tongliang Liu; Xinge You

arXiv:2602.20999·cs.CV·March 3, 2026

TL;DR

This paper reveals a new security risk in image-to-video models where malicious instructions can be covertly embedded in images to generate harmful videos, and proposes a framework to demonstrate and exploit this vulnerability.

Contribution

The paper introduces Visual Instruction Injection (VII), a training-free method that disguises the malicious intent of unsafe text prompts as benign visual instructions in a safe reference image, exposing a significant security flaw in I2V models.

Findings

01. Achieves up to 83.5% attack success rate on state-of-the-art models.

02. Reduces refusal rates to near zero, enabling effective malicious content generation.

03. Outperforms existing baselines in jailbreaking image-to-video models.
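
The findings above are stated in terms of attack success rate and refusal rate, which this page does not define. As a minimal sketch, assuming each evaluated attempt has already been given exactly one outcome label by some evaluator, both metrics reduce to simple fractions over the attempts. The function name `summarize_outcomes` and the labels `"success"`, `"refused"`, and `"failed"` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def summarize_outcomes(outcomes):
    """Compute attack success rate and refusal rate from outcome labels.

    `outcomes` is a list of strings, one per evaluated attempt. The label
    set used here ("success", "refused", "failed") is an assumption made
    for illustration, not the paper's evaluation protocol.
    """
    if not outcomes:
        raise ValueError("outcomes must contain at least one label")

    allowed = {"success", "refused", "failed"}
    unknown = set(outcomes) - allowed
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")

    counts = Counter(outcomes)
    total = len(outcomes)
    return {
        # Fraction of attempts the evaluator judged as successful attacks.
        "attack_success_rate": counts["success"] / total,
        # Fraction of attempts the model or its safety filter refused.
        "refusal_rate": counts["refused"] / total,
    }


if __name__ == "__main__":
    example = ["success", "success", "success", "refused", "failed"]
    print(summarize_outcomes(example))
```

Under this assumed definition the two rates need not sum to one, because an attempt can be neither refused nor judged successful.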

Abstract

Image-to-Video (I2V) generation models, which condition video generation on reference images, have shown emerging visual instruction-following capability, allowing certain visual cues in reference images to act as implicit control signals for video generation. However, this capability also introduces a previously overlooked risk: adversaries may exploit visual instructions to inject malicious intent through the image modality. In this work, we uncover this risk by proposing Visual Instruction Injection (VII), a training-free and transferable jailbreaking framework that intentionally disguises the malicious intent of unsafe text prompts as benign visual instructions in the safe reference image. Specifically, VII coordinates a Malicious Intent Reprogramming module to distill malicious intent from unsafe text prompts while minimizing their static harmfulness, and a Visual Instruction…

Code & Models

Datasets

yonglixiang/COCO-I2VSafetyBench
dataset · 106 downloads

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Advanced Malware Detection Techniques