Curiosity Driven Knowledge Retrieval for Mobile Agents
Sijia Li, Xiaoyu Tan, Shahir Ali, Niels Schmidt, Gengchen Ma, Xihe Qiu

TL;DR
This paper presents a curiosity-driven knowledge retrieval framework for mobile agents that enhances their ability to handle complex tasks by dynamically retrieving and integrating external information, significantly improving reliability and success rates.
Contribution
It introduces a novel curiosity score mechanism and structured AppCards for external knowledge retrieval, improving mobile agent performance in complex environments.
Findings
Achieves a 6% average performance gain on the AndroidWorld benchmark.
Reaches a new state-of-the-art success rate of 88.8% with GPT-5.
AppCards are especially effective for multi-step and cross-application tasks.
Abstract
Mobile agents have made progress toward reliable smartphone automation, yet performance in complex applications remains limited by incomplete knowledge and weak generalization to unseen environments. We introduce a curiosity driven knowledge retrieval framework that formalizes uncertainty during execution as a curiosity score. When this score exceeds a threshold, the system retrieves external information from documentation, code repositories, and historical trajectories. Retrieved content is organized into structured AppCards, which encode functional semantics, parameter conventions, interface mappings, and interaction patterns. During execution, an enhanced agent selectively integrates relevant AppCards into its reasoning process, thereby compensating for knowledge blind spots and improving planning reliability. Evaluation on the AndroidWorld benchmark shows consistent improvements…
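The gating step described in the abstract can be illustrated with a minimal Python sketch. The excerpt above does not say how the curiosity score is computed, so this sketch assumes a stand-in: the normalized Shannon entropy of the agent's distribution over candidate actions. The AppCard field names follow the four kinds of content listed in the abstract, while the function names, the in-memory card store, and the threshold value of 0.5 are all hypothetical and not taken from the paper.

```python
import math
from dataclasses import dataclass, field


@dataclass
class AppCard:
    """Structured knowledge about one app; fields mirror the abstract's list."""
    app: str
    functional_semantics: str
    parameter_conventions: str
    interface_mappings: dict[str, str] = field(default_factory=dict)
    interaction_patterns: list[str] = field(default_factory=list)


def curiosity_score(action_probs: list[float]) -> float:
    """ASSUMPTION: normalized entropy of the action distribution, in [0, 1].

    The paper's actual definition is not given in this summary.
    """
    if len(action_probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in action_probs if p > 0)
    return entropy / math.log(len(action_probs))


def select_cards(
    app: str,
    action_probs: list[float],
    card_store: list[AppCard],
    threshold: float = 0.5,  # hypothetical value, not from the paper
) -> list[AppCard]:
    """Return AppCards for `app` only when curiosity exceeds the threshold."""
    if curiosity_score(action_probs) <= threshold:
        return []  # agent is confident enough; skip retrieval
    return [card for card in card_store if card.app == app]


# Hypothetical card store with a single illustrative entry.
store = [
    AppCard(
        app="calendar",
        functional_semantics="Create, edit, and delete events",
        parameter_conventions="Times are entered in 24-hour format",
        interface_mappings={"plus button": "create event"},
        interaction_patterns=["open app", "tap plus button", "fill form", "save"],
    )
]

# A near-uniform distribution signals high uncertainty and triggers retrieval.
uncertain = select_cards("calendar", [0.25, 0.25, 0.25, 0.25], store)
# A peaked distribution signals confidence, so no cards are retrieved.
confident = select_cards("calendar", [0.97, 0.01, 0.01, 0.01], store)
```

In this sketch retrieval is a simple filter over a local list; a fuller system would, per the abstract, draw on documentation, code repositories, and historical trajectories when building the cards.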
Taxonomy
Topics: Mobile Agent-Based Network Management · Context-Aware Activity Recognition Systems · Social Robot Interaction and HRI
