DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning
Wenxuan Shi, Haochen Tan, Chuqiao Kuang, Xiaoguang Li, Xiaozhe Ren, Chen Zhang, Hanting Chen, Yasheng Wang, Lu Hou, Lifeng Shang

TL;DR
DeepDiver is a reinforcement learning framework that trains large language models to adaptively scale their search effort on the open web, improving information-seeking capability over fixed prompting methods.
Contribution
The paper presents WebPuzzle, a new benchmark for open-web question answering, and develops DeepDiver, an RL-based approach that cultivates search intensity scaling in LLMs and outperforms existing models on real-world tasks.
Findings
DeepDiver achieves comparable performance to much larger models on open-web tasks.
The search policy generalizes from closed-ended to open-ended queries.
WebPuzzle provides a rigorous benchmark for evaluating open-web information seeking.
Abstract
Information seeking demands iterative evidence gathering and reflective reasoning, yet large language models (LLMs) still struggle with it in open-web question answering. Existing prompting and supervised fine-tuning (SFT) methods remain fixed by prompt rules or training corpora, and are usually benchmarked only on well-structured wiki sources, limiting real-world adaptability. We introduce WebPuzzle, a 24k-sample training and 275-sample test benchmark that evaluates information seeking on the live internet, across both wiki and open-domain queries. Leveraging 7k WebPuzzle instances, we develop DeepDiver, a reinforcement-learning (RL) framework that cultivates Search Intensity Scaling (SIS), an emergent ability to escalate search frequency and depth instead of settling on overconfident, under-evidenced answers. With SIS, Qwen2.5-7B-Instruct and Pangu-7B-Reasoner attain performance on…
Taxonomy
Topics: Data Stream Mining Techniques · Peer-to-Peer Network Technologies · Web Data Mining and Analysis
