Physical Prompt Injection Attacks on Large Vision-Language Models

Chen Ling; Kai Hu; Hangcheng Liu; Xingshuo Han; Tianwei Zhang; Changhai Ou

arXiv:2601.17383 · cs.CV · January 27, 2026

TL;DR

This paper introduces a black-box physical prompt injection attack on large vision-language models (LVLMs): malicious typographic instructions are embedded in physical objects that the model can see, manipulating its outputs without any access to the model itself.

Contribution

The authors present what they describe as the first physical, query-agnostic prompt injection attack on LVLMs. It operates solely through visual observation and strategic placement of the prompt, and it reaches high success rates across multiple models and physical conditions.

Findings

01 Achieves up to 98% attack success rate

02 Effective under varying physical conditions

03 Works across multiple state-of-the-art LVLMs
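
The 98% figure above is an attack success rate (ASR). As a minimal sketch, assuming ASR is the fraction of trials in which the model's output follows the injected instruction (the common definition; the paper's exact success criterion is not given in this summary), it could be computed as shown below. The `Trial` record, its field names, the model labels, and the sample data are all hypothetical and are not results from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One attack attempt (hypothetical record layout, not from the paper)."""
    model: str      # label of the LVLM under test
    hijacked: bool  # True if the output followed the injected instruction


def attack_success_rate(trials):
    """Return the fraction of trials in which the attack succeeded."""
    trials = list(trials)
    if not trials:
        raise ValueError("need at least one trial")
    return sum(t.hijacked for t in trials) / len(trials)


def asr_by_model(trials):
    """Group trials by model label and compute the ASR of each group."""
    groups = defaultdict(list)
    for t in trials:
        groups[t.model].append(t)
    return {model: attack_success_rate(ts) for model, ts in groups.items()}


# Made-up illustration data: 3 of 4 successes on one model, 1 of 2 on another.
SAMPLE_TRIALS = [
    Trial("model-a", True),
    Trial("model-a", True),
    Trial("model-a", False),
    Trial("model-a", True),
    Trial("model-b", True),
    Trial("model-b", False),
]

overall = attack_success_rate(SAMPLE_TRIALS)
per_model = asr_by_model(SAMPLE_TRIALS)
```

Reporting a per-model breakdown alongside the overall rate matters for a claim phrased as "up to": the headline number is the best case over models and conditions, not an average.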

Abstract

Large Vision-Language Models (LVLMs) are increasingly deployed in real-world intelligent systems for perception and reasoning in open physical environments. While LVLMs are known to be vulnerable to prompt injection attacks, existing methods either require access to input channels or depend on knowledge of user queries, assumptions that rarely hold in practical deployments. We propose the first Physical Prompt Injection Attack (PPIA), a black-box, query-agnostic attack that embeds malicious typographic instructions into physical objects perceivable by the LVLM. PPIA requires no access to the model, its inputs, or internal pipeline, and operates solely through visual observation. It combines offline selection of highly recognizable and semantically effective visual prompts with strategic environment-aware placement guided by spatiotemporal attention, ensuring that the injected prompts…

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications