MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation
Zichen Zhu, Hao Tang, Yansi Li, Dingye Liu, Hongshen Xu, Kunyao Lan, Danyang Zhang, Yixuan Jiang, Hao Zhou, Chenrun Wang, Situo Zhang, Liangtai Sun, Yixiao Wang, Yuheng Sun, Lu Chen, Kai Yu

TL;DR
MobA is a novel mobile assistant system that leverages adaptive planning, reflection, and multifaceted memory to improve handling of complex GUI interactions and task automation on mobile devices.
Contribution
MobA introduces a multifaceted memory module and an adaptive planning mechanism with reflection for error recovery, advancing mobile task automation capabilities.
Findings
MobA outperforms baseline agents on both the MobBench and AndroidArena datasets.
The system effectively manages dynamic GUI environments.
MobA demonstrates improved task success rates and adaptability.
Abstract
Existing Multimodal Large Language Model (MLLM)-based agents face significant challenges in handling complex GUI (Graphical User Interface) interactions on devices. These challenges arise from the dynamic and structured nature of GUI environments, which integrate text, images, and spatial relationships, as well as the variability in action spaces across different pages and tasks. To address these limitations, we propose MobA, a novel MLLM-based mobile assistant system. MobA introduces an adaptive planning module that incorporates a reflection mechanism for error recovery and dynamically adjusts plans to align with the real environment contexts and action module's execution capacity. Additionally, a multifaceted memory module provides comprehensive memory support to enhance adaptability and efficiency. We also present MobBench, a dataset designed for complex mobile interactions.…
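As a rough illustration of the plan, execute, and reflect cycle the abstract describes, the toy sketch below tries candidate actions for each sub-goal, records failures in a small memory, and re-plans around them. It is not MobA's implementation: the `Memory` class, the `run_task` function, the action names, and the retry limit are all hypothetical, and the real system uses an MLLM for planning rather than a fixed candidate list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Memory:
    """Hypothetical stand-in for a memory module: failed actions per sub-goal."""
    failures: Dict[str, List[str]] = field(default_factory=dict)

    def record_failure(self, goal: str, action: str) -> None:
        self.failures.setdefault(goal, []).append(action)

    def failed(self, goal: str) -> List[str]:
        return self.failures.get(goal, [])


def run_task(
    subgoals: List[str],
    candidates: Dict[str, List[str]],
    execute: Callable[[str], bool],
    memory: Memory,
    max_attempts: int = 3,
) -> Tuple[bool, List[Tuple[str, str, bool]]]:
    """Toy plan-execute-reflect loop (an assumption, not the paper's algorithm)."""
    trace: List[Tuple[str, str, bool]] = []
    for goal in subgoals:
        done = False
        for _ in range(max_attempts):
            # Plan: choose the first candidate not already known to fail.
            options = [a for a in candidates[goal] if a not in memory.failed(goal)]
            if not options:
                break
            action = options[0]
            ok = execute(action)  # Act on the (simulated) environment.
            trace.append((goal, action, ok))
            if ok:
                done = True
                break
            # Reflect: remember the failure so the next plan avoids it.
            memory.record_failure(goal, action)
        if not done:
            return False, trace
    return True, trace


if __name__ == "__main__":
    memory = Memory()
    plans = {"open_app": ["tap_icon"], "search": ["tap_search_bar", "tap_magnifier"]}
    # Simulated environment in which tapping the search bar always fails.
    success, steps = run_task(
        ["open_app", "search"], plans, lambda a: a != "tap_search_bar", memory
    )
    print(success, steps)
```

In this sketch the memory only stores failed actions; the multifaceted memory described in the abstract is presumably richer, and nothing here should be read as its actual structure.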
Taxonomy
Topics: Mobile Agent-Based Network Management · Optimization and Search Problems · Multi-Agent Systems and Negotiation
Methods: ALIGN · Genetic Algorithms
