LiteResearcher: A Scalable Agentic RL Training Framework for Deep Research Agent

Wanli Li; Bince Qu; Bo Pan; Jianyu Zhang; Zheng Liu; Pan Zhang; Wei Chen; Bo Zhang

arXiv:2604.17931·cs.AI·April 23, 2026

TL;DR

LiteResearcher introduces a scalable RL training framework that uses a lite virtual world to train deep research agents, enabling a 4B model to outperform larger models on GAIA and Xbench.

Contribution

The paper presents an RL training framework built on a lite virtual world that mirrors real-world search dynamics, removing the dependence on live search that makes agentic RL training unstable and costly.

Findings

1. LiteResearcher-4B achieves 71.3% on GAIA and 78.0% on Xbench.
2. The framework enables small agents to outperform larger models.
3. Scalable RL training is shown to be crucial for deep research agents.

Abstract

Reinforcement Learning (RL) has emerged as a powerful training paradigm for LLM-based agents. However, scaling agentic RL for deep research remains constrained by two coupled challenges: hand-crafted synthetic data fails to elicit genuine real-world search capabilities, and real-world search dependency during RL training introduces instability and prohibitive cost, which limits the scalability of Agentic RL. LiteResearcher is a training framework that makes Agentic RL scalable: by constructing a lite virtual world that mirrors real-world search dynamics, we enable a continuously improving training recipe that empowers a tiny search agent to outperform large-scale open-source and commercial models (e.g., Tongyi DeepResearch and Claude-4.5 Sonnet). Specifically, on common benchmarks such as GAIA and Xbench, our LiteResearcher-4B achieves open-source state-of-the-art results of 71.3% and…
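
To make the "lite virtual world" idea concrete, here is a minimal toy sketch of an offline search environment that an RL rollout could query instead of a live search engine. It is not the paper's implementation: the class `VirtualSearchWorld`, the function `rollout_reward`, the `local://` URLs, the term-overlap ranking, and the binary reward are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    text: str


class VirtualSearchWorld:
    """Toy offline stand-in for a live search engine (hypothetical, not the paper's code)."""

    def __init__(self, pages):
        # A fixed local corpus makes every rollout deterministic and free of network cost.
        self.pages = {p.url: p for p in pages}

    def search(self, query, k=2):
        # Rank pages by how many query terms they contain; ties keep corpus order.
        terms = set(query.lower().split())
        scored = []
        for page in self.pages.values():
            overlap = len(terms & set(page.text.lower().split()))
            if overlap:
                scored.append((overlap, page.url))
        scored.sort(key=lambda item: -item[0])  # Python's sort is stable
        return [url for _, url in scored[:k]]

    def visit(self, url):
        # Unknown URLs return an empty string, mimicking a dead link.
        page = self.pages.get(url)
        return page.text if page else ""


def rollout_reward(world, queries, answer):
    """Return 1.0 if any page reached through the queries contains the answer, else 0.0."""
    for query in queries:
        for url in world.search(query):
            if answer.lower() in world.visit(url).lower():
                return 1.0
    return 0.0


# Assumed two-page corpus, invented for this example.
corpus = [
    Page("local://a", "GAIA is a benchmark for general AI assistants"),
    Page("local://b", "Xbench evaluates deep research agents"),
]
world = VirtualSearchWorld(corpus)
```

Because the corpus is fixed, the same query always returns the same results, which is the property the abstract contrasts with the instability and cost of real-world search during training. A realistic virtual world would need far richer retrieval and page content than this sketch.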

Code & Models

Models

Hugging Face: simplex-ai-inc/LiteResearcher-4B (model, 133 downloads, 1 like)
