AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents
Yuxiang Chai, Siyuan Huang, Yazhe Niu, Han Xiao, Liang Liu, Dingyu Zhang, Shuai Ren, Hongsheng Li

TL;DR
AMEX is a large-scale, multi-annotation dataset for mobile GUI-control agents. Its detailed annotations and natural language instructions are meant to help agents understand and interact with mobile app interfaces.
Contribution
Introduces AMEX, a comprehensive dataset with multi-level annotations for mobile GUI understanding, and demonstrates its effectiveness by fine-tuning a baseline model.
Findings
AMEX contains over 104K annotated screenshots from mobile apps.
A SPHINX Agent baseline fine-tuned on AMEX shows improved performance on GUI tasks.
AMEX is intended to support further research on mobile GUI interaction and understanding.
Abstract
AI agents have drawn increasing attention, mostly for their ability to perceive environments, understand tasks, and autonomously achieve goals. To advance research on AI agents in mobile scenarios, we introduce the Android Multi-annotation EXpo (AMEX), a comprehensive, large-scale dataset designed for generalist mobile GUI-control agents, which complete tasks by directly interacting with the graphical user interface (GUI) on mobile devices. AMEX comprises over 104K high-resolution screenshots from popular mobile applications, annotated at multiple levels. Unlike existing GUI-related datasets such as Rico and AitW, AMEX includes three levels of annotations: GUI interactive element grounding, GUI screen and element functionality descriptions, and complex natural language instructions with stepwise GUI-action chains. We develop this dataset from a more instructive…
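To make the three annotation levels concrete, the following is a minimal sketch of how one screenshot's annotations might be represented in Python. It is not the dataset's actual schema: every class name, field name, and sample value here (`ElementAnnotation`, `ScreenAnnotation`, `ActionStep`, `InstructionEpisode`, `levels_present`, `shot_0001`) is a hypothetical choice made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical types; the real AMEX file format may differ.
BBox = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, an assumed convention


@dataclass
class ElementAnnotation:
    bbox: BBox                           # level 1: interactive element grounding
    functionality: Optional[str] = None  # level 2: element functionality description


@dataclass
class ScreenAnnotation:
    screenshot_id: str
    elements: List[ElementAnnotation] = field(default_factory=list)
    screen_description: Optional[str] = None  # level 2: screen description


@dataclass
class ActionStep:
    screenshot_id: str
    action: str                          # e.g. "tap"; the action vocabulary is assumed
    target_bbox: Optional[BBox] = None


@dataclass
class InstructionEpisode:
    instruction: str                     # level 3: natural language instruction
    steps: List[ActionStep] = field(default_factory=list)  # stepwise GUI-action chain


def levels_present(screen: ScreenAnnotation,
                   episodes: List[InstructionEpisode]) -> List[int]:
    """Return which of the three annotation levels cover this screenshot."""
    levels = []
    if screen.elements:
        levels.append(1)
    if screen.screen_description or any(e.functionality for e in screen.elements):
        levels.append(2)
    if any(step.screenshot_id == screen.screenshot_id
           for ep in episodes for step in ep.steps):
        levels.append(3)
    return levels


# Made-up sample data, not taken from the dataset.
screen = ScreenAnnotation(
    screenshot_id="shot_0001",
    elements=[ElementAnnotation(bbox=(40, 120, 680, 200),
                                functionality="Opens the search page")],
    screen_description="Home screen of a shopping app",
)
episode = InstructionEpisode(
    instruction="Search for running shoes",
    steps=[ActionStep("shot_0001", "tap", (40, 120, 680, 200))],
)
print(levels_present(screen, [episode]))  # prints [1, 2, 3]
```

Keeping the instruction episodes separate from the per-screenshot annotations is one possible design: a single screenshot can then carry grounding and description labels whether or not it appears in any action chain.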
Taxonomy
Topics: Web Data Mining and Analysis · Context-Aware Activity Recognition Systems · Mobile Agent-Based Network Management
Methods: Softmax · Attention Is All You Need
