WebWorld: A Large-Scale World Model for Web Agent Training

Zikai Xiao; Jianhong Tu; Chuhang Zou; Yuxin Zuo; Zhi Li; Peng Wang; Bowen Yu; Fei Huang; Junyang Lin; and Zuozhu Liu

arXiv:2602.14721·cs.AI·February 17, 2026

Open Access

TL;DR

WebWorld is a large-scale open-web simulator trained on over a million interactions, enabling advanced reasoning, multi-format data handling, and long-horizon simulations, significantly improving web agent training and generalization.

Contribution

Introduces WebWorld, the first large-scale open-web simulator supporting extensive interactions and long-horizon simulations, with new benchmarks and cross-domain generalization capabilities.

Findings

01. WebWorld achieves simulation performance comparable to Gemini-3-Pro.

02. Training Qwen3-14B on WebWorld data improves WebArena performance by +9.2%.

03. WebWorld outperforms GPT-5 in inference-time search and generalizes across domains.

Abstract

Web agents require massive trajectories to generalize, yet real-world training is constrained by network latency, rate limits, and safety risks. We introduce the WebWorld series, the first open-web simulator trained at scale. While existing simulators are restricted to closed environments with thousands of trajectories, WebWorld leverages a scalable data pipeline to train on 1M+ open-web interactions, supporting reasoning, multi-format data, and long-horizon simulations of 30+ steps. For intrinsic evaluation, we introduce WebWorld-Bench with dual metrics spanning nine dimensions, where WebWorld achieves simulation performance comparable to Gemini-3-Pro. For extrinsic evaluation, Qwen3-14B trained on WebWorld-synthesized trajectories improves by +9.2% on WebArena, reaching performance comparable to GPT-4o. WebWorld enables effective inference-time search, outperforming GPT-5 as a…
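The inference-time search mentioned in the abstract can be pictured as lookahead with a simulator: before acting on the real site, the agent asks a world model what each candidate action would lead to and picks the most promising outcome. The abstract does not describe WebWorld's actual search procedure, so the following is only a minimal one-step sketch under stated assumptions. The names `lookahead_select`, `toy_simulate`, and `toy_score` are hypothetical, and the toy simulator is a plain string function standing in for a learned model.

```python
from typing import Callable, Sequence

# Hypothetical types: a page state and an action are both plain strings here.
State = str
Action = str


def lookahead_select(
    state: State,
    candidates: Sequence[Action],
    simulate: Callable[[State, Action], State],
    score: Callable[[State], float],
) -> Action:
    """Return the candidate whose simulated next state scores highest."""
    if not candidates:
        raise ValueError("need at least one candidate action")
    # Simulate each action once and rank by the score of the predicted state.
    return max(candidates, key=lambda a: score(simulate(state, a)))


def toy_simulate(state: State, action: Action) -> State:
    # Assumption: stand-in for a learned simulator; it only appends the action.
    return f"{state} -> {action}"


def toy_score(state: State) -> float:
    # Assumption: stand-in for a task-progress estimate; rewards reaching checkout.
    return 1.0 if state.endswith("click[checkout]") else 0.0


best = lookahead_select(
    "cart page",
    ["click[home]", "click[checkout]", "scroll[down]"],
    toy_simulate,
    toy_score,
)
print(best)  # click[checkout]
```

In this sketch the real environment is never touched during selection, which is the property that sidesteps the latency, rate-limit, and safety constraints the abstract lists. A multi-step variant would call `simulate` repeatedly along each candidate path before scoring.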

Taxonomy

Topics: Scientific Computing and Data Management · Artificial Intelligence in Games · Reinforcement Learning in Robotics