CoME: Empowering Channel-of-Mobile-Experts with Informative Hybrid-Capabilities Reasoning

Yuxuan Liu; Weikai Xu; Kun Huang; Changyu Chen; Jiankun Zhao; Pengzhi Gao; Wei Liu; Jian Luan; Shuo Shang; Bo Du; Ji-Rong Wen; Rui Yan

arXiv:2602.24142·cs.CL·March 9, 2026

Open Access

TL;DR

This paper introduces CoME, a novel mobile agent architecture with four specialized experts and a progressive training strategy, enhancing hybrid-capabilities reasoning and outperforming existing methods on benchmark datasets.

Contribution

The paper proposes a new architecture with expert-specific modules and training strategies to improve hybrid-capabilities reasoning in mobile agents.

Findings

01. CoME outperforms dense mobile agents on benchmark datasets.

02. The progressive training strategy effectively enhances expert capabilities.

03. InfoGain-Driven DPO reduces error propagation in reasoning processes.

Abstract

Mobile agents can autonomously execute user instructions, which requires hybrid-capabilities reasoning, including screen summary, subtask planning, action decision, and action function. However, existing agents struggle to achieve both decoupled enhancement and balanced integration of these capabilities. To address these challenges, we propose Channel-of-Mobile-Experts (CoME), a novel agent architecture consisting of four distinct experts, each aligned with a specific reasoning stage. CoME activates the corresponding expert to generate output tokens in each reasoning stage via output-oriented activation. To empower CoME with hybrid-capabilities reasoning, we introduce a progressive training strategy: Expert-FT enables decoupling and enhancement of the different experts' capabilities; Router-FT aligns expert activation with the corresponding reasoning stage; CoT-FT facilitates seamless…
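The idea of one expert per reasoning stage can be illustrated with a toy sketch. This is not the paper's implementation: the stage identifiers, the placeholder experts, and the hard-coded lookup in `route` are all assumptions made for illustration, whereas the abstract describes a router that is trained (Router-FT) rather than fixed by hand.

```python
from typing import Callable, Dict, List, Tuple

# The four reasoning stages named in the abstract, in the order listed there.
# The identifier spellings are hypothetical.
STAGES: List[str] = [
    "screen_summary",
    "subtask_planning",
    "action_decision",
    "action_function",
]


def make_expert(stage: str) -> Callable[[str], str]:
    """Build a placeholder expert; a real expert would be a trained module."""
    def expert(context: str) -> str:
        # Return a tagged stub instead of generated tokens.
        return f"<{stage}>"
    return expert


# One hypothetical expert per stage.
EXPERTS: Dict[str, Callable[[str], str]] = {s: make_expert(s) for s in STAGES}


def route(stage: str) -> Callable[[str], str]:
    """Select the expert by the stage whose output is being produced.

    Assumption: a fixed lookup stands in for the learned router.
    """
    return EXPERTS[stage]


def run_chain(instruction: str) -> List[Tuple[str, str]]:
    """Run the stages in order, feeding each output into the next context."""
    context = instruction
    trace: List[Tuple[str, str]] = []
    for stage in STAGES:
        output = route(stage)(context)
        trace.append((stage, output))
        context = f"{context} {output}"
    return trace


if __name__ == "__main__":
    for stage, output in run_chain("open settings"):
        print(stage, output)
```

The sketch only shows the control flow: exactly one expert is active per stage, and later stages see the outputs of earlier ones.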

Taxonomy

Topics: AI-based Problem Solving and Planning · Multi-Agent Systems and Negotiation · Multimodal Machine Learning Applications