Zero-Permission Manipulation: Can We Trust Large Multimodal Model Powered GUI Agents?
Yi Qian, Kunwei Qian, Xingbang He, Ligeng Chen, Jikang Zhang, Tiantai Zhang, Haiyang Wei, Linzhang Wang, Hao Wu, Bing Mao

TL;DR
This paper uncovers a critical security vulnerability in large multimodal model powered GUI agents on Android, demonstrating how an app with zero dangerous permissions can exploit UI state changes between observation and action to bypass verification and steer the agent.
Contribution
It introduces Action Rebinding and Intent Alignment Strategy (IAS), novel attack techniques exploiting the UI observation-action gap in Android GUI agents, revealing a fundamental security flaw.
Findings
100% success in atomic action rebinding
Ability to orchestrate multi-step attack chains reliably
IAS increases bypass success rate to 100%
Abstract
Large multimodal model powered GUI agents are emerging as high-privilege operators on mobile platforms, entrusted with perceiving screen content and injecting inputs. However, their design operates under the implicit assumption of Visual Atomicity: that the UI state remains invariant between observation and action. We demonstrate that this assumption is fundamentally invalid in Android, creating a critical attack surface. We present Action Rebinding, a novel attack that allows a seemingly-benign app with zero dangerous permissions to rebind an agent's execution. By exploiting the inevitable observation-to-action gap inherent in the agent's reasoning pipeline, the attacker triggers foreground transitions to rebind the agent's planned action toward the target app. We weaponize the agent's task-recovery logic and Android's UI state preservation to orchestrate programmable, multi-step…
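The observation-to-action gap described in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the authors' implementation: `Screen`, `run_agent_step`, the tick-based latency model, and the app names are all hypothetical, and no Android API is involved. It only shows that an action planned against one observed foreground app is delivered to whichever app is in the foreground at injection time.

```python
from dataclasses import dataclass, field


@dataclass
class Screen:
    """Toy stand-in for the device foreground state (hypothetical, not an Android API)."""
    foreground: str
    log: list = field(default_factory=list)

    def tap(self, x, y):
        # Input is delivered to whatever app is in the foreground at this moment.
        self.log.append((self.foreground, x, y))
        return self.foreground


def run_agent_step(screen, reasoning_ticks, switch_at_tick=None, switch_to=None):
    """Observe, 'reason' for some ticks, then act on the now-stale observation."""
    observed_app = screen.foreground          # observation (time of check)
    planned_tap = (100, 200)                  # action planned for observed_app
    for tick in range(reasoning_ticks):       # model latency: the gap
        if tick == switch_at_tick:
            screen.foreground = switch_to     # foreground transition mid-gap
    receiving_app = screen.tap(*planned_tap)  # injection (time of use)
    return observed_app, receiving_app


if __name__ == "__main__":
    observed, received = run_agent_step(
        Screen("benign_app"), reasoning_ticks=5,
        switch_at_tick=2, switch_to="target_app",
    )
    print(observed, received)  # benign_app target_app
```

In this sketch the agent never re-checks the foreground before acting, which is the Visual Atomicity assumption in miniature: when no transition occurs during the gap, the tap lands on the observed app; when one does, the same planned tap is rebound to the other app.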
Taxonomy
Topics: Advanced Malware Detection Techniques · Security and Verification in Computing · Adversarial Robustness in Machine Learning
