JPRO: Automated Multimodal Jailbreaking via Multi-Agent Collaboration Framework

Yuxuan Zhou; Yang Bai; Kuofeng Gao; Tao Dai; Shu-Tao Xia

arXiv:2511.07315 · cs.CR · November 11, 2025

Open Access

TL;DR

JPRO is an automated multi-agent framework that significantly improves the diversity, scalability, and success rate of jailbreaking large multimodal vision-language models, revealing critical security vulnerabilities.

Contribution

The paper introduces a novel multi-agent collaborative framework with seed generation and adaptive optimization for effective, diverse, and scalable VLM jailbreaking.

Findings

1. Achieves over 60% attack success rate on multiple VLMs
2. Outperforms existing methods in attack diversity and scalability
3. Uncovers critical security vulnerabilities in multimodal models

Abstract

The widespread application of large vision-language models (VLMs) makes ensuring their secure deployment critical. While recent studies have demonstrated jailbreak attacks on VLMs, existing approaches are limited: they either require white-box access, which restricts practicality, or rely on manually crafted patterns, which leads to poor sample diversity and scalability. To address these gaps, we propose JPRO, a novel multi-agent collaborative framework for automated VLM jailbreaking that overcomes the shortcomings of prior methods in attack diversity and scalability. Through the coordinated action of four specialized agents and two core modules, Tactic-Driven Seed Generation and the Adaptive Optimization Loop, JPRO generates effective and diverse attack samples. Experimental results show that JPRO achieves an attack success rate of over 60% on multiple advanced VLMs, including GPT-4o, significantly…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing