WebWorld: A Large-Scale World Model for Web Agent Training
Zikai Xiao, Jianhong Tu, Chuhang Zou, Yuxin Zuo, Zhi Li, Peng Wang, Bowen Yu, Fei Huang, Junyang Lin, and Zuozhu Liu

TL;DR
WebWorld is a large-scale open-web simulator trained on over a million interactions, enabling advanced reasoning, multi-format data handling, and long-horizon simulations, significantly improving web agent training and generalization.
Contribution
Introduces WebWorld, the first large-scale open-web simulator supporting extensive interactions and long-horizon simulations, with new benchmarks and cross-domain generalization capabilities.
Findings
WebWorld achieves simulation performance comparable to Gemini-3-Pro.
Training Qwen3-14B on WebWorld data improves WebArena performance by +9.2%.
WebWorld outperforms GPT-5 in inference-time search and generalizes across domains.
Abstract
Web agents require massive trajectories to generalize, yet real-world training is constrained by network latency, rate limits, and safety risks. We introduce the WebWorld series, the first open-web simulator trained at scale. While existing simulators are restricted to closed environments with thousands of trajectories, WebWorld leverages a scalable data pipeline to train on 1M+ open-web interactions, supporting reasoning, multi-format data, and long-horizon simulations of 30+ steps. For intrinsic evaluation, we introduce WebWorld-Bench with dual metrics spanning nine dimensions, where WebWorld achieves simulation performance comparable to Gemini-3-Pro. For extrinsic evaluation, Qwen3-14B trained on WebWorld-synthesized trajectories improves by +9.2% on WebArena, reaching performance comparable to GPT-4o. WebWorld enables effective inference-time search, outperforming GPT-5 as a…
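To make the idea of inference-time search with a world model concrete, here is a minimal one-step lookahead sketch. It is not WebWorld's actual interface or search procedure, which the abstract does not describe. The function `pick_action`, the toy transition table, and the scoring rule are all hypothetical stand-ins chosen for illustration. The agent asks a simulator to predict the page that each candidate action would produce, scores the predictions, and takes the best-scoring action without touching the real website.

```python
from typing import Callable, Dict, List, Tuple


def pick_action(
    state: str,
    actions: List[str],
    simulate: Callable[[str, str], str],
    score: Callable[[str], float],
) -> str:
    """Return the candidate action whose simulated next page scores highest.

    `simulate` plays the role of a world model (hypothetical signature):
    it maps (current page, action) to a predicted next page.
    """
    best_action, best_score = actions[0], float("-inf")
    for action in actions:
        predicted_page = simulate(state, action)  # no real browser call
        value = score(predicted_page)
        if value > best_score:
            best_action, best_score = action, value
    return best_action


# Toy stand-in for a learned simulator: a fixed lookup table (assumption).
TOY_TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("home", "click_search"): "search_results",
    ("home", "click_cart"): "empty_cart",
    ("home", "click_login"): "login_form",
}


def toy_simulate(state: str, action: str) -> str:
    # Unknown (state, action) pairs leave the page unchanged.
    return TOY_TRANSITIONS.get((state, action), state)


def toy_score(page: str) -> float:
    # Hypothetical task: the goal is to reach the search results page.
    return 1.0 if page == "search_results" else 0.0


chosen = pick_action(
    "home", ["click_cart", "click_search", "click_login"], toy_simulate, toy_score
)
print(chosen)
```

In a real setting the lookup table would be replaced by a call to a learned simulator and the score by a task-specific value estimate. Deeper search would apply the same simulate-and-score step recursively over multiple steps.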
Taxonomy
Topics: Scientific Computing and Data Management · Artificial Intelligence in Games · Reinforcement Learning in Robotics
