VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents
Tri Cao, Bennett Lim, Yue Liu, Yuan Sui, Yuexin Li, Shumin Deng, Lin Lu, Nay Oo, Shuicheng Yan, Bryan Hooi

TL;DR
This paper introduces VPI-Bench, a comprehensive benchmark assessing the vulnerability of computer-use agents to visual prompt injection attacks, revealing significant susceptibility and limited defense effectiveness.
Contribution
It presents VPI-Bench, a new benchmark with 306 test cases across multiple platforms, to evaluate and quantify vulnerabilities of CUAs and BUAs to visual prompt injection attacks.
Findings
CUAs can be deceived at rates up to 51%
BUAs can be deceived at rates up to 100%
System prompt defenses offer limited protection
Abstract
Computer-Use Agents (CUAs) with full system access enable powerful task automation but pose significant security and privacy risks due to their ability to manipulate files, access user data, and execute arbitrary commands. While prior work has focused on browser-based agents and HTML-level attacks, the vulnerabilities of CUAs remain underexplored. In this paper, we investigate Visual Prompt Injection (VPI) attacks, where malicious instructions are visually embedded within rendered user interfaces, and examine their impact on both CUAs and Browser-Use Agents (BUAs). We propose VPI-Bench, a benchmark of 306 test cases across five widely used platforms, to evaluate agent robustness under VPI threats. Each test case is a variant of a web platform, designed to be interactive, deployed in a realistic environment, and containing a visually embedded malicious prompt. Our empirical study shows…
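As a rough illustration of how per-agent deception rates such as those quoted in the findings could be tallied from benchmark runs, here is a minimal sketch. It is not the authors' evaluation code: the `TrialResult` record, its fields, the `attack_success_rates` helper, and the demo data are all hypothetical, and the sketch assumes each test case yields a single boolean verdict on whether the injected instruction was carried out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialResult:
    """One agent run on one test case (hypothetical schema, not VPI-Bench's)."""
    agent: str
    platform: str
    attack_succeeded: bool  # assumed verdict: did the agent follow the injected prompt?


def attack_success_rates(results):
    """Return the fraction of runs per agent in which the attack succeeded."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in results:
        totals[r.agent] += 1
        hits[r.agent] += int(r.attack_succeeded)
    return {agent: hits[agent] / totals[agent] for agent in totals}


# Made-up demo data, for illustration only; not results from the paper.
demo = [
    TrialResult("agent_a", "email", True),
    TrialResult("agent_a", "shopping", False),
    TrialResult("agent_b", "email", True),
    TrialResult("agent_b", "shopping", True),
]
rates = attack_success_rates(demo)
print(rates)
```

Grouping by `platform` instead of `agent` would give a per-platform breakdown under the same assumptions.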
Peer Reviews
Decision: ICLR 2026 Poster
1. The paper is clearly organized and well-written. 2. The benchmark spans multiple domains (e-commerce, travel, news, communication), evaluates a diverse set of state-of-the-art models, and incorporates both Computer-Use and Browser-Use agent frameworks, ensuring a thorough and multi-faceted evaluation of vulnerabilities. 3. The paper provides a valuable, empirical analysis of multiple contemporary defense strategies (fine-tuning, framework-level guards, system prompts). The findings that thes
1. A primary concern is the lack of a clear and significant distinction between the proposed Visual Prompt Injection attacks and previously studied pop-up attacks. This ambiguity substantially limits the perceived novelty and contribution of the work. 2. The reliance on an ensemble of LLMs as judges for evaluating agent behavior requires stronger validation. The absence of a comprehensive human study to robustly verify the accuracy and reliability of this automated evaluation method is a notable
The paper is well-written, easy to follow, and clearly articulates an underexplored but timely problem: security risks of computer-use agents under visual prompt injection. The proposed end-to-end threat model captures realistic adversarial scenarios where visual cues alone can trigger system-level consequences, bridging a gap left by previous HTML-based or non-interactive attack studies. VPI-Bench is comprehensive and reproducible, spanning multiple platforms, tasks, and agent types, with de
While the empirical coverage is broad, the conceptual novelty is limited—the paper mainly packages known ideas (prompt injection + visual modality) into a benchmark without proposing new defensive mechanisms or theoretical contributions. Moreover, the analysis depth of why certain models or scenarios are more vulnerable is shallow—there is little interpretability or causal insight beyond quantitative rates. Finally, the defense discussion remains superficial, reiterating that existing method
The paper has the following strengths:
- The paper is well written, clear, and easy to follow.
- The benchmark looks very useful for future research.
- The paper looks at full agent rollouts, considering both early and late attacks and checking whether the attack was carried out to completion.
- The paper shows that current multi-model agents are still highly susceptible to simple visual pop-up based attacks; however, the novelty here is limited
The paper has the following weaknesses:
- The paper's threat model seems unrealistic, assuming that large platforms such as the BBC or Amazon have been compromised and infected with adversarial pop-ups.
- To the best of my knowledge, the paper does not present results for benign performance of the model on the tasks when no attack is performed. This is strange, as it is standard practice in many of the prior works cited in this paper. It also provides for a much richer analysis of what might be caus
Taxonomy
Topics: Advanced Malware Detection Techniques · Psychedelics and Drug Studies
