Models as Lego Builders: Assembling Malice from Benign Blocks via Semantic Blueprints

Chenxi Li; Xianggan Liu; Dake Shen; Yaosong Du; Zhibo Yao; Hao Jiang; Linyi Jiang; Chengwei Cao; Jingzhe Zhang; RanYi Peng; Peiling Bai; Xiande Huang

arXiv:2603.07590 · cs.CV · March 10, 2026

Open Access

TL;DR

This paper shows that large vision-language models (LVLMs) can be led to assemble unsafe content from visual prompts whose individual components appear benign, producing unsafe outputs despite safety measures.

Contribution

The paper introduces StructAttack, a single-query, black-box jailbreak framework that exploits semantic slot filling to elicit harmful outputs from benign-looking structured visual prompts.

Findings

01 StructAttack effectively induces unsafe outputs across multiple models.

02 Semantic slot decomposition can be exploited to assemble harmful content.

03 The attack works without triggering existing safety mechanisms.

Abstract

Despite the rapid progress of Large Vision-Language Models (LVLMs), the integration of visual modalities introduces new safety vulnerabilities that adversaries can exploit to elicit biased or malicious outputs. In this paper, we demonstrate an underexplored vulnerability via semantic slot filling, where LVLMs complete missing slot values with unsafe content even when the slot types are deliberately crafted to appear benign. Building on this finding, we propose StructAttack, a simple yet effective single-query jailbreak framework under black-box settings. StructAttack decomposes a harmful query into a central topic and a set of benign-looking slot types, then embeds them as structured visual prompts (e.g., mind maps, tables, or sunburst diagrams) with small random perturbations. Paired with a completion-guided instruction, LVLMs automatically recompose the concealed semantics and…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Ethics and Social Impacts of AI