How Smart Is Your GUI Agent? A Framework for the Future of Software Interaction
Sidong Feng, Chunyang Chen

TL;DR
This paper introduces a six-level framework called GUI Agent Autonomy Levels (GAL) to clarify the autonomy of GUI agents, enabling better benchmarking and understanding of their capabilities, responsibilities, and risks in software interaction.
Contribution
It proposes a novel six-level framework for explicitly measuring and benchmarking GUI agent autonomy, addressing ambiguity in current agent capabilities.
Findings
The GAL framework clarifies autonomy levels in GUI agents.
It facilitates benchmarking and comparison of GUI agent capabilities.
The framework promotes development of trustworthy software interaction.
Abstract
GUI agents are rapidly becoming a new way of interacting with software, allowing people to navigate web, desktop and mobile applications rather than operate them click by click. Yet "agent" is described with radically different degrees of autonomy, obscuring capability, responsibility and risk. We call for conceptual clarity through GUI Agent Autonomy Levels (GAL), a six-level framework that makes autonomy explicit and helps benchmark progress toward trustworthy software interaction.
Taxonomy
Topics: Social Robot Interaction and HRI · Advanced Software Engineering Methodologies · Usability and User Interface Design
