AgentCPM-Explore: Realizing Long-Horizon Deep Exploration for Edge-Scale Agents
Haotian Chen, Xin Cong, Shengda Fan, Yuyang Fu, Ziqin Gong, Yaxi Lu, Yishan Li, Boye Niu, Chengjun Pan, Zijun Song, Huadong Wang, Yesai Wu, Yueying Wu, Zihao Xie, Yukun Yan, Zhong Zhang, Yankai Lin, Zhiyuan Liu, Maosong Sun

TL;DR
This paper introduces AgentCPM-Explore, a 4B-parameter agent model that overcomes key bottlenecks in edge-scale models, achieving state-of-the-art results and surpassing larger models through a holistic training framework focused on stability and exploration.
Contribution
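The paper presents what the authors describe as the first systematic study of training agent models at the 4B-parameter scale. It identifies three bottlenecks (catastrophic forgetting during SFT, sensitivity to reward noise during RL, and reasoning degradation from redundant long-context information) and proposes a training framework to address them.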
Findings
Achieves 97.09% accuracy on GAIA text-based tasks under pass@64.
Matches or surpasses 8B-class SOTA models on four benchmarks.
Outperforms larger models such as Claude-4.5-Sonnet on five benchmarks.
Abstract
While Large Language Model (LLM)-based agents have shown remarkable potential for solving complex tasks, existing systems remain heavily reliant on large-scale models, leaving the capabilities of edge-scale models largely underexplored. In this paper, we present the first systematic study on training agentic models at the 4B-parameter scale. We identify three primary bottlenecks hindering the performance of edge-scale models: catastrophic forgetting during Supervised Fine-Tuning (SFT), sensitivity to reward signal noise during Reinforcement Learning (RL), and reasoning degradation caused by redundant information in long-context scenarios. To address the issues, we propose AgentCPM-Explore, a compact 4B agent model with high knowledge density and strong exploration capability. We introduce a holistic training framework featuring parameter-space model fusion, reward signal denoising, and…
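The abstract names parameter-space model fusion as one component of the framework but, in the truncated text available here, does not say how it is carried out. As a rough illustration only, the sketch below shows one common form of parameter-space fusion: linear interpolation between a base checkpoint and a fine-tuned checkpoint, which is often used to retain general capabilities after SFT. The function name `fuse_parameters`, the mixing weight `alpha`, and the plain-dictionary checkpoint format are all assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def fuse_parameters(base: Params, tuned: Params, alpha: float) -> Params:
    """Linearly interpolate two checkpoints in parameter space.

    alpha = 0.0 returns the base weights, alpha = 1.0 returns the
    fine-tuned weights. This is a generic technique, not necessarily
    the fusion rule used by AgentCPM-Explore.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if base.keys() != tuned.keys():
        raise ValueError("checkpoints must contain the same parameter names")

    fused: Params = {}
    for name, base_w in base.items():
        tuned_w = tuned[name]
        if len(base_w) != len(tuned_w):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two weight vectors.
        fused[name] = [(1.0 - alpha) * b + alpha * t
                       for b, t in zip(base_w, tuned_w)]
    return fused


if __name__ == "__main__":
    base_ckpt = {"layer.weight": [0.0, 2.0]}
    tuned_ckpt = {"layer.weight": [1.0, 4.0]}
    print(fuse_parameters(base_ckpt, tuned_ckpt, alpha=0.5))
```

In a sketch like this, a smaller `alpha` keeps the fused model closer to the base checkpoint, trading some task-specific gain for retained general knowledge.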
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
