DeepAgent: A General Reasoning Agent with Scalable Toolsets

Xiaoxi Li; Wenxiang Jiao; Jiarui Jin; Guanting Dong; Jiajie Jin; Yinuo Wang; Hao Wang; Yutao Zhu; Ji-Rong Wen; Yuan Lu; Zhicheng Dou

arXiv:2510.21618 · cs.AI · February 6, 2026

TL;DR

DeepAgent is an end-to-end reasoning agent that thinks, discovers tools, and executes actions within a single reasoning process. It uses memory folding to handle long-horizon interactions and a reinforcement learning strategy, ToolPO, to learn tool use, and it outperforms baselines on eight benchmarks.

Contribution

The paper introduces DeepAgent, an end-to-end reasoning framework with autonomous tool discovery, an autonomous memory folding mechanism, and ToolPO, an end-to-end reinforcement learning strategy for efficient and stable general-purpose tool use.

Findings

01. Outperforms baselines on eight diverse benchmarks.

02. Effectively manages long-horizon interactions with memory folding.

03. Demonstrates robust general-purpose tool use in open-set scenarios.

Abstract

Large reasoning models have demonstrated strong problem-solving abilities, yet real-world tasks often require external tools and long-horizon interactions. Existing agent frameworks typically follow predefined workflows, which limit autonomous and global task completion. In this paper, we introduce DeepAgent, an end-to-end deep reasoning agent that performs autonomous thinking, tool discovery, and action execution within a single, coherent reasoning process. To manage long-horizon interactions, we introduce an autonomous memory folding mechanism that compresses past interactions into structured episodic, working, and tool memories, reducing error accumulation while preserving critical information. To teach general-purpose tool use efficiently and stably, we develop an end-to-end reinforcement learning strategy, namely ToolPO, that leverages LLM-simulated APIs and applies tool-call…

Code & Models

Models

lixiaoxi45/DeepAgent-QwQ-32B (model, 126 downloads)

Datasets

lixiaoxi45/DeepAgent-Datasets (dataset, 66 downloads)
