K^2-Agent: Co-Evolving Know-What and Know-How for Hierarchical Mobile Device Control
Zhe Wu, Donglin Mo, Hongjin Lu, Junliang Xing, Jianheng Liu, Yuheng Jing, Kai Li, Kun Shao, Jianye Hao, Yuanchun Shi

TL;DR
K2-Agent introduces a hierarchical framework that co-evolves declarative and procedural knowledge for improved long-horizon planning and skill execution in mobile device control, demonstrating strong generalization and success on challenging benchmarks.
Contribution
The paper presents K2-Agent, a novel hierarchical approach that models human-like cognition by separately and iteratively refining task-level knowledge and low-level skills, enabling better generalization and performance.
Findings
Achieves a 76.1% success rate on the AndroidWorld benchmark.
Demonstrates effective transfer of declarative knowledge across models.
Shows competitive performance on unseen tasks in ScreenSpot-v2 and AitW.
Abstract
Existing mobile device control agents often perform poorly when solving complex tasks requiring long-horizon planning and precise operations, typically due to a lack of relevant task experience or unfamiliarity with skill execution. We propose K2-Agent, a hierarchical framework that models human-like cognition by separating and co-evolving declarative (knowing what) and procedural (knowing how) knowledge for planning and execution. K2-Agent's high-level reasoner is bootstrapped from a single demonstration per task and runs a Summarize-Reflect-Locate-Revise (SRLR) loop to distill and iteratively refine task-level declarative knowledge through self-evolution. The low-level executor is trained with our curriculum-guided Group Relative Policy Optimization (C-GRPO), which (i) constructs a balanced sample pool using decoupled reward signals and (ii) employs dynamic demonstration injection to…
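For readers unfamiliar with Group Relative Policy Optimization, the sketch below shows the group-relative advantage computation that standard GRPO is built around: each rollout's reward is normalized against the mean and standard deviation of the other rollouts sampled for the same prompt. It does not reproduce the curriculum, sample-pool, or demonstration-injection components of C-GRPO, which the abstract does not specify. The function name, the `eps` constant, the use of the population standard deviation, and the reward values are illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and std.

    Illustrative helper (hypothetical name): `rewards` holds one scalar
    reward per rollout sampled for the same prompt. `eps` is an assumed
    small constant that avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four rollouts of one task (1.0 = success, 0.0 = failure).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# A group whose rollouts all receive the same reward yields zero advantages.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Successful rollouts receive positive advantages and failed ones negative advantages, while a group with identical rewards contributes no learning signal. This is one common motivation for balancing the sample pool, although the abstract does not state how C-GRPO does so.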
Taxonomy
Topics: Reinforcement Learning in Robotics · AI-based Problem Solving and Planning · Advanced Software Engineering Methodologies
