Red-teaming the Multimodal Reasoning: Jailbreaking Vision-Language Models via Cross-modal Entanglement Attacks

Yu Yan; Sheng Sun; Shengjia Cheng; Teli Liu; Mingfeng Li; Min Liu

arXiv:2602.10148 · cs.CR · February 12, 2026

TL;DR

This paper introduces CrossTALK, a scalable attack method that entangles information across visual and textual modalities to effectively jailbreak vision-language models and bypass safety mechanisms.

Contribution

The paper presents CrossTALK, a novel scalable cross-modal entanglement attack approach that surpasses existing methods in breaking VLM safety alignments.

Findings

01  CrossTALK achieves state-of-the-art attack success rates.

02  It effectively exploits multi-hop instructions and contextual clues.

03  The method demonstrates scalability and robustness in red-teaming VLMs.

Abstract

Vision-Language Models (VLMs) with multimodal reasoning capabilities are high-value attack targets, given their potential for handling complex multimodal harmful tasks. Mainstream black-box jailbreak attacks on VLMs work by distributing malicious clues across modalities to disperse model attention and bypass safety alignment mechanisms. However, these adversarial attacks rely on simple and fixed image-text combinations that lack attack complexity scalability, limiting their effectiveness for red-teaming VLMs' continuously evolving reasoning capabilities. We propose CrossTALK (Cross-modal enTAngLement attacK), which is a scalable approach that extends and entangles information clues across modalities to exceed VLMs' trained and generalized safety alignment patterns for jailbreak. Specifically,…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Ethics and Social Impacts of AI