WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Zelai Xu; Zhexuan Xu; Ruize Zhang; Chunyang Zhu; Shi Yu; Weilin Liu; Quanlu Zhang; Wenbo Ding; Chao Yu; Yu Wang

arXiv:2602.04634 · cs.AI · March 13, 2026

TL;DR

This paper introduces WideSeek-R1, a multi-agent reinforcement learning framework that improves broad information seeking by scaling width through parallel subagents. WideSeek-R1-4B reaches 40.0% item F1 on the WideSearch benchmark, and performance improves as more subagents run in parallel.

Contribution

Proposes WideSeek-R1, a multi-agent RL framework that effectively scales width for broad information seeking, outperforming traditional hand-crafted multi-agent systems.

Findings

01

WideSeek-R1-4B achieves 40.0% item F1 on the WideSearch benchmark.

02

Performance improves as the number of parallel subagents increases.

03

WideSeek-R1's performance is comparable to that of single-agent models with far more parameters.
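
For readers unfamiliar with the item F1 metric cited in the findings, the following is a minimal sketch of an item-level F1 score. It assumes that each item is a single table cell and that a predicted item counts as correct only on an exact match with a gold item; the source does not state WideSearch's actual matching rules, so the function name `item_f1` and the set-based matching are illustrative assumptions only.

```python
def item_f1(predicted: set, gold: set) -> float:
    """Item-level F1 under an exact-match assumption (hypothetical helper).

    Each element of `predicted` and `gold` stands for one retrieved item,
    for example a (row key, value) pair from a result table.
    """
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold items that were found
    return 2 * precision * recall / (precision + recall)


# Toy data, not taken from the paper: 2 of 3 predictions match 4 gold items.
predicted_items = {("Paris", "FR"), ("Rome", "IT"), ("Oslo", "SE")}
gold_items = {("Paris", "FR"), ("Rome", "IT"), ("Oslo", "NO"), ("Bern", "CH")}
score = item_f1(predicted_items, gold_items)
```

With this toy data, precision is 2/3 and recall is 2/4, so the harmonic mean is 4/7. A broad task rewards recall in particular, which is where additional parallel subagents would be expected to help.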

Abstract

Recent advancements in Large Language Models (LLMs) have largely focused on depth scaling, where a single agent solves long-horizon problems with multi-turn reasoning and tool use. However, as tasks grow broader, the key bottleneck shifts from individual competence to organizational capability. In this work, we explore a complementary dimension of width scaling with multi-agent systems to address broad information seeking. Existing multi-agent systems often rely on hand-crafted workflows and turn-taking interactions that fail to parallelize work effectively. To bridge this gap, we propose WideSeek-R1, a lead-agent-subagent framework trained via multi-agent reinforcement learning (MARL) to synergize scalable orchestration and parallel execution. By utilizing a shared LLM with isolated contexts and specialized tools, WideSeek-R1 jointly optimizes the lead agent and parallel subagents on a…
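
The lead-agent-subagent pattern described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `shared_llm` is a stand-in that echoes its prompt instead of calling a model, and `AgentContext`, `run_subagent`, and `lead_agent` are hypothetical names. The sketch shows only two ideas from the abstract: every agent calls the same model with its own isolated context, and subagents run in parallel rather than taking turns.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentContext:
    """Isolated message history for one agent (hypothetical structure)."""
    messages: List[str] = field(default_factory=list)


async def shared_llm(context: AgentContext, prompt: str) -> str:
    """Stand-in for the single shared model; it echoes instead of generating."""
    context.messages.append(prompt)
    await asyncio.sleep(0)  # yield control, as a real model or tool call would
    return f"result for: {prompt}"


async def run_subagent(subtask: str) -> str:
    # Each subagent starts from a fresh context, so histories never mix.
    context = AgentContext()
    return await shared_llm(context, subtask)


async def lead_agent(task: str, subtasks: List[str]) -> Dict[str, object]:
    lead_context = AgentContext(messages=[task])
    # Width scaling: all subtasks are dispatched concurrently.
    results = await asyncio.gather(*(run_subagent(s) for s in subtasks))
    # Only the subagents' final results enter the lead agent's context.
    lead_context.messages.extend(results)
    return {"rows": list(results), "lead_context_size": len(lead_context.messages)}


def demo() -> Dict[str, object]:
    return asyncio.run(lead_agent("compare three cities", ["city A", "city B", "city C"]))
```

In this sketch the lead agent's context grows by one entry per subagent result rather than by each subagent's full history, which is one plausible reading of why isolated contexts help as the number of subagents increases. The reinforcement learning that jointly optimizes the lead agent and subagents is outside the scope of this example.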

Taxonomy

Topics: Advanced Graph Neural Networks · Topic Modeling · Big Data and Digital Economy