Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
Yu Gu, Kai Zhang, Yuting Ning, Boyuan Zheng, Boyu Gou, Tianci Xue, Cheng Chang, Sanjari Srivastava, Yanan Xie, Peng Qi, Huan Sun, Yu Su

TL;DR
This paper introduces WebDreamer, a model-based planning framework that uses LLMs as world models for web agents. It outperforms reactive baselines on web tasks and is more efficient than tree search.
Contribution
It proposes a novel framework employing LLMs as world models for web agents and demonstrates scalable training of specialized models for improved planning efficiency.
Findings
WebDreamer outperforms reactive baselines in web tasks.
It is 4-5 times more efficient than tree search in sandbox environments.
Dreamer-7B performs comparably to GPT-4o in web planning.
Abstract
Language agents based on large language models (LLMs) have demonstrated great promise in automating web-based tasks. Recent work has shown that incorporating advanced planning algorithms, e.g., tree search, is advantageous over reactive planning for web agents. However, unlike simulated sandbox environments, real-world environments such as the web are rife with irreversible actions. This undermines the feasibility of backtracking, a cornerstone of (tree) search. Overly relying on test-time search also hurts efficiency. We advocate model-based planning for web agents that employs a world model to simulate and deliberate over the outcome of each candidate action before committing to one. We systematically explore this paradigm by (1) Proposing a model-based planning framework, WebDreamer, which employs LLMs to serve as both world models and value functions; (2) Training specialized LLMs…
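The planning loop described in the abstract (simulate each candidate action with a world model, score the simulated outcome, then commit to one action) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names plan_one_step, toy_world_model, and toy_value_fn are hypothetical, and the toy functions stand in for the LLM calls that the framework would use as world model and value function.

```python
from typing import Callable, Sequence

# Hypothetical interfaces; the summary does not specify the real ones.
WorldModel = Callable[[str, str], str]   # (state, action) -> predicted next state
ValueFn = Callable[[str, str], float]    # (task, predicted state) -> score


def plan_one_step(
    task: str,
    state: str,
    candidates: Sequence[str],
    world_model: WorldModel,
    value_fn: ValueFn,
) -> str:
    """Pick the candidate whose simulated outcome scores highest.

    No action is executed in the real environment while deliberating,
    so irreversible actions are never taken just to be evaluated.
    """
    if not candidates:
        raise ValueError("at least one candidate action is required")
    best_action, best_score = candidates[0], float("-inf")
    for action in candidates:
        predicted = world_model(state, action)  # simulate, do not execute
        score = value_fn(task, predicted)       # judge progress toward the task
        if score > best_score:
            best_action, best_score = action, score
    return best_action


# Toy stand-ins (assumptions for illustration only; a real system would
# prompt an LLM in both places).
def toy_world_model(state: str, action: str) -> str:
    return f"{state} -> after {action}"


def toy_value_fn(task: str, predicted_state: str) -> float:
    return 1.0 if task in predicted_state else 0.0


chosen = plan_one_step(
    task="checkout",
    state="cart page",
    candidates=["click 'delete account'", "click 'checkout'"],
    world_model=toy_world_model,
    value_fn=toy_value_fn,
)
print(chosen)
```

Because every candidate is evaluated only in simulation, the agent commits to a single real action per step, which is the property that distinguishes this approach from tree search with backtracking.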
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Digital Rights Management and Security · Mobile Agent-Based Network Management
