Attacking Vision-Language Computer Agents via Pop-ups
Yanzhe Zhang, Tao Yu, Diyi Yang

TL;DR
This paper reveals that vision-language model-based agents are vulnerable to adversarial pop-ups, which significantly disrupt their task performance and are difficult to defend against.
Contribution
It introduces an attack that uses adversarial pop-ups against vision-language agents and evaluates its effectiveness in existing agent testing environments such as OSWorld and VisualWebArena.
Findings
Attack success rate of 86% on average
Task success rate decreases by 47%
Basic defenses are ineffective
Abstract
Autonomous agents powered by large vision and language models (VLMs) have demonstrated significant potential in completing daily computer tasks, such as browsing the web to book travel and operating desktop software, which requires agents to understand these interfaces. Although such visual inputs are becoming more integrated into agentic applications, what types of risks and attacks exist around them remains unclear. In this work, we demonstrate that VLM agents can be easily attacked by a set of carefully designed adversarial pop-ups, which human users would typically recognize and ignore. This distraction leads agents to click these pop-ups instead of performing their tasks as usual. Integrating these pop-ups into existing agent testing environments like OSWorld and VisualWebArena leads to an attack success rate (the frequency of the agent clicking the pop-ups) of 86% on average and…
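The abstract defines attack success rate as the frequency with which the agent clicks the pop-ups. A minimal sketch of how that metric, together with task success rate, might be tallied from episode logs is shown here. The `Episode` record and its fields are hypothetical, and counting per episode (rather than per step) is an assumption, since the summary does not say which granularity the authors use.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    """Hypothetical log record for one agent run (not from the paper)."""
    clicked_popup: bool   # assumed flag: agent clicked an adversarial pop-up
    task_completed: bool  # assumed flag: agent finished the original task


def attack_success_rate(episodes: Sequence[Episode]) -> float:
    """Fraction of episodes in which the agent clicked a pop-up."""
    if not episodes:
        raise ValueError("need at least one episode")
    return sum(e.clicked_popup for e in episodes) / len(episodes)


def task_success_rate(episodes: Sequence[Episode]) -> float:
    """Fraction of episodes in which the original task was completed."""
    if not episodes:
        raise ValueError("need at least one episode")
    return sum(e.task_completed for e in episodes) / len(episodes)


# Toy data for illustration only; these are not results from the paper.
toy_episodes = [
    Episode(clicked_popup=True, task_completed=False),
    Episode(clicked_popup=True, task_completed=False),
    Episode(clicked_popup=False, task_completed=True),
    Episode(clicked_popup=True, task_completed=True),
]
```

Calling `attack_success_rate(toy_episodes)` and `task_success_rate(toy_episodes)` on the toy list gives the two fractions side by side, which is the kind of comparison the Findings report for runs with pop-ups injected.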
Taxonomy
Topics: Network Security and Intrusion Detection · Multi-Agent Systems and Negotiation · Logic, Reasoning, and Knowledge
