Image-to-Text Logic Jailbreak: Your Imagination can Help You Do Anything
Xiaotian Zou, Ke Li, Yongkang Chen

TL;DR
This paper introduces a new dataset and evaluates the vulnerability of visual language models to logic-based jailbreaks using flowcharts, revealing high jailbreak success rates and highlighting security concerns.
Contribution
The study presents the Flow-JD dataset and demonstrates that current VLMs are highly susceptible to logic-based jailbreaks using flowcharts.
Findings
Jailbreak rate reaches up to 92.8% on evaluated models.
Flow-JD dataset effectively assesses logic-based vulnerabilities.
Current VLMs show significant security weaknesses.
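As a rough illustration of how a jailbreak rate like the one reported above could be computed, the sketch below takes it to be the fraction of attack attempts judged successful. That definition, the `Trial` record, and the toy labels are all assumptions made for illustration; this summary does not describe the paper's actual evaluation protocol or judging procedure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trial:
    """One attack attempt against a model (hypothetical record layout)."""
    prompt_id: str
    jailbroken: bool  # assumed to come from a human or automated judge


def jailbreak_rate(trials: Sequence[Trial]) -> float:
    """Return the fraction of trials labeled as successful jailbreaks."""
    if not trials:
        raise ValueError("at least one trial is required")
    return sum(t.jailbroken for t in trials) / len(trials)


# Toy labels for illustration only; these are not results from the paper.
toy_trials = [
    Trial("p1", True),
    Trial("p2", True),
    Trial("p3", False),
    Trial("p4", True),
]
toy_rate = jailbreak_rate(toy_trials)  # 3 successes out of 4 attempts
```

Under this assumed definition, a reported rate of 92.8% would mean roughly 93 of every 100 attempts were judged successful.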
Abstract
Large Visual Language Models (VLMs) such as GPT-4V have achieved remarkable success in generating comprehensive and nuanced responses. Researchers have proposed various benchmarks for evaluating the capabilities of VLMs. With the integration of visual and text inputs in VLMs, new security issues emerge, as malicious attackers can exploit multiple modalities to achieve their objectives. This has led to increasing attention on the vulnerability of VLMs to jailbreaks. Most existing research focuses on generating adversarial or nonsensical images to jailbreak these models. However, no prior work has evaluated whether the ability of VLMs to understand the logic in flowcharts can influence jailbreaks. To fill this gap, this paper first introduces a novel dataset, Flow-JD, specifically designed to evaluate the logic-based flowchart jailbreak capabilities of VLMs. We conduct an…
Taxonomy
Topics: Law in Society and Culture
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Adam · Dropout
