K^2-Agent: Co-Evolving Know-What and Know-How for Hierarchical Mobile Device Control

Zhe Wu; Donglin Mo; Hongjin Lu; Junliang Xing; Jianheng Liu; Yuheng Jing; Kai Li; Kun Shao; Jianye Hao; Yuanchun Shi

arXiv:2603.00676·cs.AI·March 3, 2026

TL;DR

K^2-Agent introduces a hierarchical framework that co-evolves declarative and procedural knowledge for improved long-horizon planning and skill execution in mobile device control, demonstrating strong generalization and success on challenging benchmarks.

Contribution

The paper presents K^2-Agent, a novel hierarchical approach that models human-like cognition by separately and iteratively refining task-level knowledge and low-level skills, enabling better generalization and performance.

Findings

01. Achieves a 76.1% success rate on the AndroidWorld benchmark.

02. Demonstrates effective transfer of declarative knowledge across models.

03. Shows competitive performance on unseen tasks in ScreenSpot-v2 and AitW.

Abstract

Existing mobile device control agents often perform poorly when solving complex tasks requiring long-horizon planning and precise operations, typically due to a lack of relevant task experience or unfamiliarity with skill execution. We propose K^2-Agent, a hierarchical framework that models human-like cognition by separating and co-evolving declarative (knowing what) and procedural (knowing how) knowledge for planning and execution. K^2-Agent's high-level reasoner is bootstrapped from a single demonstration per task and runs a Summarize-Reflect-Locate-Revise (SRLR) loop to distill and iteratively refine task-level declarative knowledge through self-evolution. The low-level executor is trained with our curriculum-guided Group Relative Policy Optimization (C-GRPO), which (i) constructs a balanced sample pool using decoupled reward signals and (ii) employs dynamic demonstration injection to…
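For readers unfamiliar with the GRPO family that C-GRPO builds on, the group-relative advantage at its core can be sketched as follows. This is a minimal illustration of standard GRPO normalization only, not the paper's curriculum-guided variant; the function name, the `eps` constant, and the example rewards are assumptions made for illustration, and real implementations differ in details such as which standard deviation estimator they use.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sampled rollout against the other rollouts in its group.

    Standard GRPO idea: advantage_i = (r_i - mean(group)) / std(group).
    `eps` (an assumed constant) guards against division by zero when
    every rollout in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rollouts sampled from the same prompt.
mixed_group = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(mixed_group)

# A group whose rollouts all succeed (or all fail) yields zero advantages,
# so it contributes no learning signal under this normalization.
uniform_group = [1.0, 1.0, 1.0]
uniform_advantages = group_relative_advantages(uniform_group)
```

In this sketch, successful rollouts in a mixed group receive positive advantages and failed ones negative, while a uniform group produces all zeros. That property of plain GRPO may be part of why a balanced sample pool is useful, although the truncated abstract does not state the authors' reasoning.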

Taxonomy

Topics: Reinforcement Learning in Robotics · AI-based Problem Solving and Planning · Advanced Software Engineering Methodologies