MapAgent: Trajectory-Constructed Memory-Augmented Planning for Mobile Task Automation
Yi Kong, Dianxi Shi, Guoli Yang, Zhang ke-di, Chenlin Huang, Xiaopeng Li, Songchang Jin

TL;DR
MapAgent enhances mobile task automation by using trajectory-based memory to improve LLM planning, enabling more effective and context-aware execution of complex real-world tasks.
Contribution
This paper introduces a memory mechanism built from task execution trajectories and a coarse-to-fine planning approach, which together improve LLM-based mobile task automation.
Findings
Outperforms existing methods in real-world scenarios
Memory-augmented planning improves task success rates
Effective handling of complex app interactions
Abstract
The recent advancement of autonomous agents powered by Large Language Models (LLMs) has demonstrated significant potential for automating tasks on mobile devices through graphical user interfaces (GUIs). Despite initial progress, these agents still face challenges when handling complex real-world tasks. These challenges arise from a lack of knowledge about real-life mobile applications in LLM-based agents, which may lead to ineffective task planning and even cause hallucinations. To address these challenges, we propose a novel LLM-based agent framework called MapAgent that leverages memory constructed from historical trajectories to augment current task planning. Specifically, we first propose a trajectory-based memory mechanism that transforms task execution trajectories into a reusable and structured page-memory database. Each page within a trajectory is extracted as a compact yet…
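As a rough illustration of the trajectory-based memory idea in the abstract, the sketch below flattens task execution trajectories into per-page records and looks up the records relevant to a new task. It is a minimal sketch under stated assumptions, not the authors' implementation: the `PageMemory` fields, the trajectory dictionary layout, and the word-overlap scoring in `retrieve` are all hypothetical stand-ins, since the abstract does not specify how pages are encoded or retrieved.

```python
from dataclasses import dataclass, field


@dataclass
class PageMemory:
    """One compact record per visited page (all field names are hypothetical)."""
    app: str
    page_summary: str
    actions: list = field(default_factory=list)


def build_page_memory(trajectories):
    """Flatten trajectories into a list of page records.

    Each trajectory is assumed to be a dict with an "app" name and a list of
    "steps", where every step carries a page description and the action taken.
    Steps that revisit the same page are merged into a single record.
    """
    memory = {}
    for traj in trajectories:
        for step in traj["steps"]:
            key = (traj["app"], step["page"])
            record = memory.setdefault(key, PageMemory(traj["app"], step["page"]))
            if step["action"] not in record.actions:
                record.actions.append(step["action"])
    return list(memory.values())


def retrieve(memory, task, top_k=2):
    """Rank page records by word overlap with the task.

    Word overlap is only a placeholder for whatever retriever a real system uses.
    """
    task_words = set(task.lower().split())

    def score(record):
        text = f"{record.app} {record.page_summary}".lower()
        return len(task_words & set(text.split()))

    ranked = sorted(memory, key=score, reverse=True)
    return [r for r in ranked[:top_k] if score(r) > 0]


# Toy trajectories (invented for illustration only).
trajectories = [
    {"app": "Clock", "steps": [
        {"page": "alarm list page", "action": "tap add alarm"},
        {"page": "alarm edit page", "action": "set time"},
        {"page": "alarm list page", "action": "toggle alarm"},
    ]},
    {"app": "Notes", "steps": [
        {"page": "note list page", "action": "tap new note"},
    ]},
]

memory = build_page_memory(trajectories)
hits = retrieve(memory, "set an alarm in the clock app")
for hit in hits:
    print(hit.app, "|", hit.page_summary, "|", hit.actions)
```

In this toy setup the retrieved records would be passed to the planner as extra context for the new task. Merging repeated visits to a page keeps the database compact, which is one plausible reading of "reusable and structured" in the abstract.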
