Zero-Permission Manipulation: Can We Trust Large Multimodal Model Powered GUI Agents?

Yi Qian; Kunwei Qian; Xingbang He; Ligeng Chen; Jikang Zhang; Tiantai Zhang; Haiyang Wei; Linzhang Wang; Hao Wu; Bing Mao

arXiv:2601.12349 · cs.CR · March 4, 2026

TL;DR

This paper uncovers a critical security vulnerability in large multimodal model powered GUI agents on Android: an app holding zero dangerous permissions can exploit UI state changes between the agent's observation and its action to bypass verification and redirect the agent's actions.

Contribution

It introduces Action Rebinding and Intent Alignment Strategy (IAS), novel attack techniques exploiting the UI observation-action gap in Android GUI agents, revealing a fundamental security flaw.

Findings

01. 100% success in atomic action rebinding
02. Ability to orchestrate multi-step attack chains reliably
03. IAS increases bypass success rate to 100%

Abstract

Large multimodal model powered GUI agents are emerging as high-privilege operators on mobile platforms, entrusted with perceiving screen content and injecting inputs. However, their design operates under the implicit assumption of Visual Atomicity: that the UI state remains invariant between observation and action. We demonstrate that this assumption is fundamentally invalid in Android, creating a critical attack surface. We present Action Rebinding, a novel attack that allows a seemingly-benign app with zero dangerous permissions to rebind an agent's execution. By exploiting the inevitable observation-to-action gap inherent in the agent's reasoning pipeline, the attacker triggers foreground transitions to rebind the agent's planned action toward the target app. We weaponize the agent's task-recovery logic and Android's UI state preservation to orchestrate programmable, multi-step…
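The observation-to-action gap described in the abstract is a time-of-check to time-of-use race. The toy simulation below is only a sketch of that idea and is not the authors' implementation. It uses no Android APIs, and every name in it (`Screen`, `Device`, `run_agent_step`, `switch_foreground`, the app names and coordinates) is a hypothetical stand-in. It shows a tap planned against one screen landing on another when the foreground changes before the tap is injected, and a naive foreground recheck catching that change in this simplified model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class Screen:
    app: str                   # name of the app in the foreground
    buttons: Dict[Point, str]  # tap coordinates -> button label


@dataclass
class Device:
    screen: Screen

    def observe(self) -> Screen:
        """Return the screen as it is right now (the agent's screenshot)."""
        return self.screen

    def tap(self, xy: Point) -> str:
        """Deliver a tap to whatever is in the foreground at injection time."""
        label = self.screen.buttons.get(xy, "nothing")
        return f"{self.screen.app}:{label}"


def run_agent_step(
    device: Device,
    during_gap: Optional[Callable[[Device], None]] = None,
    recheck: bool = False,
) -> str:
    # Time of check: the agent observes the UI and plans to tap "OK".
    observed = device.observe()
    target_xy = next(xy for xy, label in observed.buttons.items() if label == "OK")

    # Stand-in for model reasoning latency; the UI may change during it.
    if during_gap is not None:
        during_gap(device)

    # Optional naive guard (an assumption, not taken from the paper):
    # compare the foreground app again before acting.
    if recheck and device.observe().app != observed.app:
        return "aborted: foreground changed"

    # Time of use: the tap goes to the current foreground, not the observed one.
    return device.tap(target_xy)


BENIGN = Screen("benign_app", {(100, 200): "OK"})
TARGET = Screen("target_app", {(100, 200): "Confirm"})


def switch_foreground(device: Device) -> None:
    """Hypothetical foreground transition occurring inside the gap."""
    device.screen = TARGET


print(run_agent_step(Device(BENIGN)))                                # benign_app:OK
print(run_agent_step(Device(BENIGN), during_gap=switch_foreground))  # target_app:Confirm
print(run_agent_step(Device(BENIGN), during_gap=switch_foreground, recheck=True))
# aborted: foreground changed
```

The recheck only narrows the window in this model, because a transition could still occur between the recheck and the tap. The sketch does not claim that such a check defeats the attack described in the paper.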

Taxonomy

Topics: Advanced Malware Detection Techniques · Security and Verification in Computing · Adversarial Robustness in Machine Learning