Benchmarking Mobile Device Control Agents across Diverse Configurations
Juyong Lee, Taywon Min, Minyong An, Dongyoon Hahm, Haeone Lee, Changyeon Kim, Kimin Lee

TL;DR
This paper introduces B-MoCA, a comprehensive benchmark with diverse, randomized tasks on Android devices to evaluate and advance mobile device control agents, highlighting current limitations and future research directions.
Contribution
We present B-MoCA, a new benchmark with 131 tasks and randomized configurations for evaluating mobile control agents, facilitating standardized progress measurement.
Findings
Agents perform well on simple tasks
Complex tasks reveal significant performance gaps
Benchmark enables assessment of generalization capabilities
Abstract
Mobile device control agents can greatly enhance user interactions and productivity by automating daily tasks. However, despite growing interest in developing practical agents, the absence of a commonly adopted benchmark in this area makes it challenging to quantify scientific progress. In this work, we introduce B-MoCA: a novel benchmark with interactive environments for evaluating and developing mobile device control agents. To create a realistic benchmark, we develop B-MoCA based on the Android operating system and define 131 common daily tasks. Importantly, we incorporate a randomization feature that changes the configurations of mobile devices, including user interface layouts and language settings, to assess generalization performance. We benchmark diverse agents, including agents employing large language models (LLMs) or multi-modal LLMs as well as agents trained with imitation…
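The randomization feature described in the abstract can be illustrated with a minimal sketch. It is not B-MoCA's actual code or API: the names `DeviceConfig`, `sample_device_config`, and `split_configs`, the language codes, and the app list are all hypothetical, and holding out unseen configurations for testing is an assumption about how generalization might be assessed.

```python
import random
from dataclasses import dataclass

# Hypothetical option pools. The abstract only states that UI layouts and
# language settings vary; these concrete values are illustrative assumptions.
LANGUAGES = ["en-US", "ko-KR", "fr-FR", "es-ES"]
HOME_SCREEN_APPS = ["Clock", "Calculator", "Settings", "Contacts", "Camera", "Calendar"]


@dataclass(frozen=True)
class DeviceConfig:
    """One randomized device setup (hypothetical structure)."""
    language: str
    icon_order: tuple


def sample_device_config(rng: random.Random) -> DeviceConfig:
    """Draw a language and a shuffled home-screen icon order."""
    icons = HOME_SCREEN_APPS[:]
    rng.shuffle(icons)
    return DeviceConfig(language=rng.choice(LANGUAGES), icon_order=tuple(icons))


def split_configs(seed: int, n_train: int, n_test: int):
    """Sample distinct configurations and split them into seen and held-out sets."""
    max_distinct = len(LANGUAGES) * 720  # 6! icon orderings per language
    if n_train + n_test > max_distinct:
        raise ValueError("requested more distinct configurations than exist")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    configs, seen = [], set()
    while len(configs) < n_train + n_test:
        cfg = sample_device_config(rng)
        if cfg not in seen:  # skip duplicates so the two sets never overlap
            seen.add(cfg)
            configs.append(cfg)
    return configs[:n_train], configs[n_train:]


if __name__ == "__main__":
    train, test = split_configs(seed=0, n_train=5, n_test=3)
    for cfg in test:
        print(cfg.language, cfg.icon_order)
```

Under these assumptions, an agent would be trained or prompted on the first set and scored on the second, so its success rate reflects device setups it has not encountered.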
Taxonomy
Topics: Mobile Agent-Based Network Management · Multi-Agent Systems and Negotiation · Peer-to-Peer Network Technologies
