VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation

Ziang Ye; Yang Zhang; Wentao Shi; Xiaoyu You; Fuli Feng; Tat-Seng Chua

arXiv:2507.06899 · cs.CL · September 25, 2025

TL;DR

This paper introduces VisualTrap, a novel backdoor attack method targeting GUI agents powered by vision-language models, demonstrating its effectiveness and stealthiness in hijacking visual grounding across various environments.

Contribution

The work reveals a new vulnerability in GUI agents' visual grounding and proposes VisualTrap, a practical attack method that remains effective even with minimal poisoned data and across different GUI platforms.

Findings

01 Effective hijacking with as little as 5% poisoned data

02 Stealthy triggers invisible to humans

03 Generalizes across mobile, web, and desktop environments
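
To make the 5% figure concrete, the following toy sketch shows what a 5% poisoning rate means for a grounding dataset. It is a generic illustration of backdoor data poisoning, not the paper's implementation: the `GroundingSample` type, the `poison_dataset` helper, the normalized click coordinates, and the choice to relabel a poisoned sample's target to the trigger position are all assumptions made for this example.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # normalized (x, y) screen position; an assumption


@dataclass(frozen=True)
class GroundingSample:
    """Hypothetical grounding example: an instruction and the point to click."""
    instruction: str
    target_xy: Point
    trigger_xy: Optional[Point] = None  # None marks a clean sample


def poison_dataset(samples: List[GroundingSample],
                   rate: float = 0.05,
                   seed: int = 0) -> List[GroundingSample]:
    """Return a copy of `samples` in which a `rate` fraction is poisoned.

    A poisoned sample gets a trigger at a random position, and its label is
    moved to that position (assumed behavior for illustration only).
    """
    rng = random.Random(seed)
    n_poison = round(len(samples) * rate)
    chosen = set(rng.sample(range(len(samples)), n_poison))
    mixed = []
    for i, sample in enumerate(samples):
        if i in chosen:
            xy = (rng.random(), rng.random())
            mixed.append(replace(sample, target_xy=xy, trigger_xy=xy))
        else:
            mixed.append(sample)
    return mixed


clean = [GroundingSample(f"tap item {i}", (0.5, 0.5)) for i in range(200)]
mixed = poison_dataset(clean, rate=0.05)
n_poisoned = sum(s.trigger_xy is not None for s in mixed)
print(f"{n_poisoned} of {len(mixed)} samples poisoned")
```

With 200 samples and a 5% rate, only 10 examples carry a trigger, while the other 190 are left untouched. This is the sense in which a small poisoning budget leaves most of the training data clean.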

Abstract

Graphical User Interface (GUI) agents powered by Large Vision-Language Models (LVLMs) have emerged as a revolutionary approach to automating human-machine interactions, capable of autonomously operating personal devices (e.g., mobile phones) or applications on those devices to perform complex real-world tasks in a human-like manner. However, their close integration with personal devices raises significant security concerns, and many threats, including backdoor attacks, remain largely unexplored. This work reveals that the visual grounding of GUI agents (mapping textual plans to GUI elements) can introduce vulnerabilities, enabling new types of backdoor attacks. With a backdoor attack that targets visual grounding, the agent's behavior can be compromised even when it is given correct task-solving plans. To validate this vulnerability, we propose VisualTrap, a method that can hijack the grounding…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing