GPO-V: Jailbreak Diffusion Vision Language Model by Global Probability Optimization

Yu Pan; Andi Zhang; Yi Wang; Sibei Yang; Wenjie Wang

arXiv:2605.07399 · cs.CV · May 12, 2026

TL;DR

This paper introduces GPO-V, a jailbreak method for diffusion vision-language models (dVLMs) that bypasses safety guardrails through global probability optimization, showing that the apparent robustness of these models to conventional jailbreaks is deceptive.

Contribution

It proposes GPO, a new global probability optimization technique for attacking diffusion models, and introduces GPO-V, the first visual-modality jailbreak framework for dVLMs.

Findings

1. GPO-V effectively creates stealthy, transferable perturbations.
2. Current defense strategies are insufficient against GPO-based attacks.
3. GPO-V exposes critical security gaps in non-sequential diffusion models.

Abstract

Diffusion Vision-Language Models (dVLMs), built upon the non-causal foundations of Diffusion Large Language Models (dLLMs), have demonstrated remarkable efficacy in multimodal tasks by departing from the traditional autoregressive generation paradigm. While dVLMs appear inherently robust against conventional jailbreak tactics, which we categorize as Fixed Prefix Optimization (FPO) (e.g., anchoring responses with "Sure, here is"), this perceived resilience is deceptive. Our investigation into the safety landscape of dVLMs reveals two distinctive refusal patterns: Immediate Refusal and Progressive Refusal. We find that while FPO-based attacks often fail by triggering the latter, the progressive refinement process itself uncovers a novel, latent attack surface. To exploit this vulnerability, we propose Global Probability Optimization (GPO), a general jailbreak paradigm designed specifically for…

Code & Models

Repositories

https://anonymous.4open.science/r/GPO-V-0250
