DynaWeb: Model-Based Reinforcement Learning of Web Agents

Hang Ding; Peidong Liu; Junqiao Wang; Ziwei Ji; Meng Cao; Rongzhao Zhang; Lynn Ai; Eric Yang; Tianyu Shi; Lei Yu

arXiv:2601.22149 · cs.CL · April 21, 2026

TL;DR

DynaWeb introduces a model-based reinforcement learning framework that trains web agents using a web world model for simulated interactions, improving efficiency and performance on web benchmarks.

Contribution

The paper presents DynaWeb, a model-based reinforcement learning (MBRL) approach that trains web agents on a combination of synthetic rollouts from a web world model and real data.

Findings

1. DynaWeb outperforms existing web agents on WebArena and WebVoyager benchmarks.

2. Using a web world model enables scalable and efficient training through imagination.

3. Interleaving real and simulated data improves training stability and sample efficiency.

Abstract

The development of autonomous web agents, powered by Large Language Models (LLMs) and reinforcement learning (RL), represents a significant step towards general-purpose AI assistants. However, training these agents is severely hampered by the challenges of interacting with the live internet, which is inefficient, costly, and fraught with risks. Model-based reinforcement learning (MBRL) offers a promising solution by learning a world model of the environment to enable simulated interaction. This paper introduces DynaWeb, a novel MBRL framework that trains web agents through interacting with a web world model trained to predict naturalistic web page representations given agent actions. This model serves as a synthetic web environment where an agent policy can dream by generating vast quantities of rollout action trajectories for efficient online reinforcement learning. Beyond free policy…
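The general idea of training a policy on both real interaction and rollouts from a learned world model can be illustrated with a deliberately tiny example. The sketch below is classic tabular Dyna-Q on a hypothetical five-state chain, not DynaWeb's method: DynaWeb's world model and policy are LLM-based and operate on web page representations, whereas here the "world model" is a lookup table and every name (`real_step`, `train`, `imagined_per_real`, the chain itself) is an assumption made for illustration.

```python
import random
from collections import defaultdict

N_STATES = 5      # hypothetical toy chain: states 0..4, goal at state 4
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def real_step(state, action):
    """Toy stand-in for one costly step in the real environment."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def q_update(q, s, a, r, s2, done, alpha=0.5, gamma=0.9):
    """One Q-learning update; used for real and imagined transitions alike."""
    target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (target - q[(s, a)])


def train(episodes=30, imagined_per_real=10, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    model = {}  # learned world model: (state, action) -> (next_state, reward, done)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Real interaction: act epsilon-greedily, learn from the outcome.
            a = rng.choice(ACTIONS) if rng.random() < epsilon else greedy(q, s, rng)
            s2, r, done = real_step(s, a)
            q_update(q, s, a, r, s2, done)
            model[(s, a)] = (s2, r, done)  # fit the world model on real data
            # Imagined interaction: replay transitions predicted by the model.
            for _ in range(imagined_per_real):
                ms, ma = rng.choice(list(model))
                ms2, mr, mdone = model[(ms, ma)]
                q_update(q, ms, ma, mr, ms2, mdone)
            s = s2
    return q


if __name__ == "__main__":
    q = train()
    for s in range(N_STATES - 1):
        print(s, round(q[(s, 0)], 3), round(q[(s, 1)], 3))
```

Each real step here is followed by `imagined_per_real` updates drawn from the model, so most of the learning signal comes from simulated transitions while the real environment is queried only once per loop iteration. That ratio is the knob a model-based method turns to trade real-environment cost against reliance on model accuracy.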
