Reasoning-Oriented Programming: Chaining Semantic Gadgets to Jailbreak Large Vision Language Models

Quanchen Zou; Moyang Chen; Zonghao Ying; Wenzhuo Xu; Yisong Xiao; Deyue Zhang; Dongdong Yang; Zhao Liu; Xiangzheng Zhang

arXiv:2603.09246 · cs.CR · March 11, 2026

Open Access

TL;DR

This paper introduces Reasoning-Oriented Programming, a novel attack paradigm that chains benign visual inputs to induce harmful reasoning in large vision-language models, bypassing safety defenses.

Contribution

It formalizes the attack as a programming paradigm and develops \tool{}, an automated framework that generates semantic gadgets to evade current safety alignment.

Findings

01

\tool{} effectively bypasses safety alignment in 7 state-of-the-art LVLMs.

02

It outperforms existing baselines by 4.67% on open-source models.

03

It achieves a 9.50% improvement on commercial models.

Abstract

Large Vision-Language Models (LVLMs) undergo safety alignment to suppress harmful content. However, current defenses predominantly target explicit malicious patterns in the input representation, often overlooking the vulnerabilities inherent in compositional reasoning. In this paper, we identify a systemic flaw where LVLMs can be induced to synthesize harmful logic from benign premises. We formalize this attack paradigm as Reasoning-Oriented Programming, drawing a structural analogy to Return-Oriented Programming (ROP) in systems security. Just as ROP circumvents memory protections by chaining benign instruction sequences, our approach exploits the model's instruction-following capability to orchestrate a semantic collision of orthogonal benign inputs. We instantiate this paradigm via \tool{}, an automated framework that optimizes for semantic orthogonality and…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing