Step-DeepResearch Technical Report

Chen Hu; Haikuo Du; Heng Wang; Lin Lin; Mingrui Chen; Peng Liu; Ruihang Miao; Tianchi Yue; Wang You; Wei Ji; Wei Yuan; Wenjin Deng; Xiaojian Yuan; Xiaoyun Zhang; Xiangyu Liu; Xikai Liu; Yanming Xu; Yicheng Cao; Yifei Zhang; Yongyao Wang; Yubo Shu; Yurong Zhang; Yuxiang Zhang; Zheng Gong; Zhichao Chang; Binyan Li; Dan Ma; Furong Jia; Hongyuan Wang; Jiayu Liu; Jing Bai; Junlan Liu; Manjiao Liu; Na Wang; Qiuping Wu; Qinxin Du; Shiwei Li; Wen Sun; Yifeng Gong; Yonglin Chen; Yuling Zhao; Yuxuan Lin; Ziqi Ren; Zixuan Wang; Aihu Zhang; Brian Li; Buyun Ma; Kang An; Li Xie; Mingliang Li; Pan Li; Shidong Yang; Xi Chen; Xiaojia Liu; Yuchu Luo; Yuan Song; YuanHao Ding; Yuanwei Liang; Zexi Li; Zhaoning Zhang; Zixin Zhang; Binxing Jiao; Daxin Jiang; Jiansheng Chen; Jing Li; Xiangyu Zhang; Yibo Zhu

arXiv:2512.20491 · cs.CL · December 30, 2025

Open Access

TL;DR

This paper introduces Step-DeepResearch, a cost-effective, end-to-end agent for open-ended research tasks. Its training combines data synthesis based on atomic capabilities with a progressive path from agentic mid-training to SFT and RL, and the 32B model scores 61.4% on the Scale AI Research Rubrics.

Contribution

The paper contributes a cost-effective end-to-end agent, a data synthesis strategy based on atomic capabilities, a progressive training path (agentic mid-training, then SFT, then RL), and ADR-Bench, a benchmark for realistic deep research scenarios in the Chinese domain.

Findings

1. Step-DeepResearch (32B) scores 61.4% on the Scale AI Research Rubrics.

2. It outperforms comparable models on ADR-Bench.

3. It rivals state-of-the-art closed-source systems such as OpenAI DeepResearch and Gemini DeepResearch.

Abstract

As LLMs shift toward autonomous agents, Deep Research has emerged as a pivotal metric. However, existing academic benchmarks like BrowseComp often fail to meet real-world demands for open-ended research, which requires robust skills in intent recognition, long-horizon decision-making, and cross-source verification. To address this, we introduce Step-DeepResearch, a cost-effective, end-to-end agent. We propose a Data Synthesis Strategy Based on Atomic Capabilities to reinforce planning and report writing, combined with a progressive training path from agentic mid-training to SFT and RL. Enhanced by a Checklist-style Judger, this approach significantly improves robustness. Furthermore, to bridge the evaluation gap in the Chinese domain, we establish ADR-Bench for realistic deep research scenarios. Experimental results show that Step-DeepResearch (32B) scores 61.4% on Scale AI Research…
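The abstract names a Checklist-style Judger but the excerpt above does not describe how it works. As a rough illustration of what checklist-style scoring can look like in general, the sketch below scores a report as the weighted fraction of checklist items it satisfies. Everything here is an assumption for illustration: `ChecklistItem`, `checklist_score`, the weights, and the keyword checks are hypothetical stand-ins (a real judger would more plausibly call a model per item), and none of it is taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ChecklistItem:
    """One rubric criterion. `check` is a hypothetical stand-in for a judge call."""
    description: str
    check: Callable[[str], bool]
    weight: float = 1.0


def checklist_score(report: str, items: List[ChecklistItem]) -> float:
    """Return the weighted fraction of checklist items the report satisfies."""
    total = sum(item.weight for item in items)
    if total == 0:
        return 0.0  # an empty checklist carries no signal
    passed = sum(item.weight for item in items if item.check(report))
    return passed / total


# Toy checklist: simple keyword checks, assumed purely for illustration.
items = [
    ChecklistItem("cites at least one source", lambda r: "http" in r),
    ChecklistItem("states a conclusion", lambda r: "conclusion" in r.lower()),
    ChecklistItem("mentions limitations", lambda r: "limitation" in r.lower(), weight=2.0),
]

report = "Conclusion: demand is rising (see https://example.com)."
score = checklist_score(report, items)
print(f"checklist score: {score:.2f}")
```

Breaking a judgment into independent pass/fail items makes the score easy to audit, since each missed item can be traced to a specific criterion rather than to one opaque overall grade.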

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Multimodal Machine Learning Applications