Mobile GUI Agents under Real-world Threats: Are We There Yet?

Guohong Liu; Jialei Ye; Jiacheng Liu; Yuanchun Li; Wei Liu; Pengzhi Gao; Jian Luan; Yunxin Liu

arXiv:2507.04227 · cs.CR · April 15, 2026

TL;DR

This paper evaluates the robustness of mobile GUI agents powered by large language models against real-world threats, revealing significant performance degradation due to untrustworthy third-party content.

Contribution

It introduces a scalable content instrumentation framework and a comprehensive benchmark to test GUI agents under realistic, challenging app scenarios.

Findings

01

Agents' performance significantly degrades with third-party content.

02

The misleading rate averages 42.0% in dynamic environments and 36.1% in static environments.

03

The framework and benchmark are publicly released for further research.

Abstract

Recent years have witnessed the rapid development of mobile GUI agents powered by large language models (LLMs), which can autonomously execute diverse device-control tasks based on natural language instructions. The increasing accuracy of these agents on standard benchmarks has raised expectations for large-scale real-world deployment, and several commercial agents have already been released and are used by early adopters. However, are we really ready for GUI agents to be integrated into our daily devices as system building blocks? We argue that an important pre-deployment validation is missing: an examination of whether the agents can maintain their performance under real-world threats. Specifically, unlike existing common benchmarks that are based on simple static app content (a constraint needed to keep the environment consistent across tests), real-world apps are filled with content from…
