MobileSteward: Integrating Multiple App-Oriented Agents with Self-Evolution to Automate Cross-App Instructions

Yuxuan Liu; Hongda Sun; Wei Liu; Jian Luan; Bo Du; Rui Yan

arXiv:2502.16796·cs.MA·February 25, 2025

TL;DR

MobileSteward is a self-evolving multi-agent framework that automates complex cross-app instructions on mobile phones by integrating specialized agents and a memory-based learning mechanism, improving task execution accuracy.

Contribution

We introduce MobileSteward, the first framework combining object-oriented multi-agent coordination with self-evolution for cross-app instruction automation.

Findings

01. MobileSteward outperforms single-agent and multi-agent baselines.

02. The Memory-based Self-evolution enhances task success rates.

03. CAPBench provides a new benchmark for cross-app instruction tasks.

Abstract

Mobile phone agents, which can assist people in automating daily tasks on their phones, have emerged as a pivotal research focus. However, existing procedure-oriented agents struggle with cross-app instructions due to the following challenges: (1) complex task relationships, (2) diverse app environments, and (3) error propagation and information loss in multi-step execution. Drawing inspiration from object-oriented programming principles, we recognize that object-oriented solutions are more suitable for cross-app instructions. To address these challenges, we propose a self-evolving multi-agent framework named MobileSteward, which integrates multiple app-oriented StaffAgents coordinated by a centralized StewardAgent. We design three specialized modules in MobileSteward: (1) Dynamic Recruitment generates a scheduling graph guided by information flow to explicitly associate tasks among…
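The coordination pattern named in the abstract, a centralized StewardAgent dispatching subtasks to app-oriented StaffAgents along a scheduling graph, might be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the paper's implementation: the method names `run` and `execute`, the `subtasks` and `depends_on` data layout, and the example apps are all hypothetical, and the real agents would drive an LLM and a phone UI rather than return strings.

```python
from graphlib import TopologicalSorter


class StaffAgent:
    """Hypothetical app-oriented agent that handles subtasks for one app."""

    def __init__(self, app):
        self.app = app

    def run(self, subtask, inputs):
        # Stand-in for real UI automation: record the subtask and its inputs.
        return f"{self.app}:{subtask}({', '.join(inputs)})"


class StewardAgent:
    """Hypothetical coordinator that dispatches subtasks in dependency order."""

    def __init__(self, staff):
        self.staff = staff  # maps app name -> StaffAgent

    def execute(self, subtasks, depends_on):
        # subtasks: id -> (app, description); depends_on: id -> set of ids
        graph = {tid: depends_on.get(tid, set()) for tid in subtasks}
        results = {}
        # static_order() yields each subtask after all of its predecessors.
        for tid in TopologicalSorter(graph).static_order():
            app, description = subtasks[tid]
            # Pass upstream results forward so information is not lost.
            inputs = [results[dep] for dep in sorted(graph[tid])]
            results[tid] = self.staff[app].run(description, inputs)
        return results


# Assumed example instruction: look up an address, then text it to someone.
steward = StewardAgent({"Maps": StaffAgent("Maps"),
                        "Messages": StaffAgent("Messages")})
subtasks = {"find": ("Maps", "find_address"),
            "send": ("Messages", "send_text")}
results = steward.execute(subtasks, {"send": {"find"}})
```

Here the `send` subtask receives the output of `find` as an explicit input, which is one simple way to make the information flow between apps visible in the schedule.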
