If you're waiting for a sign... that might not be it! Mitigating Trust Boundary Confusion from Visual Injections on Vision-Language Agentic Systems

Jiamin Chang; Minhui Xue; Ruoxi Sun; Shuchao Pang; Salil S. Kanhere; Hammond Pearce

arXiv:2604.19844·cs.CV·April 23, 2026

TL;DR

This paper identifies trust boundary confusion in vision-language agents: the agent must act on legitimate environmental signals while resisting visually similar signals crafted to mislead it. It proposes a defense framework to improve robustness against such visual injections.

Contribution

It introduces a dual-intent dataset and evaluation framework, and proposes a multi-agent defense system to mitigate visual injection vulnerabilities in embodied vision-language agents.

Findings

01 Current LVLM agents often ignore useful signals or follow harmful ones.

02 The proposed defense significantly reduces misleading behaviors.

03 The evaluation framework and code are publicly available.

Abstract

Recent advances in embodied Vision-Language Agentic Systems (VLAS), powered by large vision-language models (LVLMs), enable AI systems to perceive and reason over real-world scenes. Within this context, environmental signals such as traffic lights are essential in-band signals that can and should influence agent behavior. However, similar signals could also be crafted to operate as misleading visual injections, overriding user intent and posing security risks. This duality creates a fundamental challenge: agents must respond to legitimate environmental cues while remaining robust to misleading ones. We refer to this tension as trust boundary confusion. To study this behavior, we design a dual-intent dataset and evaluation framework, through which we show that current LVLM-based agents fail to reliably balance this trade-off, either ignoring useful signals or following harmful ones. We…
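To make the two failure modes named in the abstract concrete (ignoring useful signals, following harmful ones), here is a minimal sketch of how a dual-intent evaluation could be scored. It is an illustration only, not the authors' framework or metric: the `Trial` record, its field names, and the two rates are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One evaluation episode (hypothetical schema, not from the paper)."""
    signal_is_legitimate: bool   # True: a cue the agent should heed, e.g. a traffic light
    agent_followed_signal: bool  # True: the agent's action complied with the signal


def score(trials):
    """Return the rate of each failure mode over its relevant subset of trials."""
    benign = [t for t in trials if t.signal_is_legitimate]
    misleading = [t for t in trials if not t.signal_is_legitimate]

    # Failure mode 1: a legitimate signal was present but the agent ignored it.
    ignored_useful = sum(not t.agent_followed_signal for t in benign)
    # Failure mode 2: a misleading injection was present and the agent obeyed it.
    followed_harmful = sum(t.agent_followed_signal for t in misleading)

    return {
        "ignored_useful_rate": ignored_useful / len(benign) if benign else 0.0,
        "followed_harmful_rate": followed_harmful / len(misleading) if misleading else 0.0,
    }


if __name__ == "__main__":
    demo = [
        Trial(signal_is_legitimate=True, agent_followed_signal=True),
        Trial(signal_is_legitimate=True, agent_followed_signal=False),
        Trial(signal_is_legitimate=False, agent_followed_signal=True),
        Trial(signal_is_legitimate=False, agent_followed_signal=False),
    ]
    print(score(demo))
```

Reporting the two rates separately matters under this framing: an agent that ignores every sign scores perfectly on the second rate and fails the first, so neither number alone captures the trade-off.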

Code & Models

Repositories

https://anonymous.4open.science/r/Visual-Prompt-Inject
