GhostEI-Bench: Do Mobile Agents Resilience to Environmental Injection in Dynamic On-Device Environments?
Chiyu Chen, Xinhao Song, Yunkai Chai, Yang Yao, Haodong Zhao, Lijun Li, Jie Li, Yan Teng, Gongshen Liu, and Yingchun Wang

TL;DR
GhostEI-Bench introduces a benchmark to evaluate the resilience of mobile vision-language agents against environmental injection attacks that manipulate visual UI elements, revealing significant vulnerabilities in current models.
Contribution
This work presents the first benchmark for assessing mobile agents under environmental injection threats, including a novel LLM-based failure analysis protocol and comprehensive vulnerability evaluation.
Findings
Current agents are highly vulnerable to deceptive UI manipulations.
Environmental injection can cause agents to misperceive or misreason about UI elements.
GhostEI-Bench enables systematic evaluation and mitigation of these threats.
Abstract
Vision-Language Models (VLMs) are increasingly deployed as autonomous agents to navigate mobile graphical user interfaces (GUIs). Operating in dynamic on-device ecosystems, which include notifications, pop-ups, and inter-app interactions, exposes them to a unique and underexplored threat vector: environmental injection. Unlike prompt-based attacks that manipulate textual instructions, environmental injection corrupts an agent's visual perception by inserting adversarial UI elements (for example, deceptive overlays or spoofed notifications) directly into the GUI. This bypasses textual safeguards and can derail execution, causing privacy leakage, financial loss, or irreversible device compromise. To systematically evaluate this threat, we introduce GhostEI-Bench, the first benchmark for assessing mobile agents under environmental injection attacks within dynamic, executable environments.…
Taxonomy
Topics: Advanced Malware Detection Techniques · Security and Verification in Computing · Personal Information Management and User Behavior
