From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating Mobile UI Operation Impacts
Zhuohao Jerry Zhang, Eldon Schoop, Jeffrey Nichols, Anuj Mahajan, Amanda Swearngin

TL;DR
This paper studies the real-world impacts of actions that AI agents take when operating mobile UIs. It develops a taxonomy of those impacts, collects impact data, and evaluates how well large language models understand and classify the impacts, with the aim of promoting safer agent behavior.
Contribution
It introduces a new impact taxonomy for mobile UI actions, creates a dataset annotated with impact categories, and evaluates LLMs' understanding of UI impacts, highlighting current limitations.
Findings
Impact taxonomy improves LLM reasoning about UI actions
LLMs show gaps in classifying nuanced impact categories
Data synthesis enables realistic impact evaluation
Abstract
With advances in generative AI, there is increasing work towards creating autonomous agents that can manage daily tasks by operating user interfaces (UIs). While prior research has studied the mechanics of how AI agents might navigate UIs and understand UI structure, the effects of agents and their autonomous actions, particularly those that may be risky or irreversible, remain under-explored. In this work, we investigate the real-world impacts and consequences of mobile UI actions taken by AI agents. We began by developing a taxonomy of the impacts of mobile UI actions through a series of workshops with domain experts. Following this, we conducted a data synthesis study to gather realistic mobile UI screen traces and action data that users perceive as impactful. We then used our impact categories to annotate our collected data and data repurposed from existing mobile UI navigation…
Taxonomy
Topics: Software Engineering Research
