Structured Distillation of Web Agent Capabilities Enables Generalization
Xing Han Lù, Siva Reddy

TL;DR
This paper presents a structured approach to training web agents using synthetic trajectories generated by LLMs, enabling competitive performance and transferability across unseen environments.
Contribution
The authors introduce Agent-as-Annotators, a modular framework for synthetic trajectory generation that improves web agent generalization and performance with a single teacher model.
Findings
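Achieves 41.5% on WebArena, surpassing closed-source models such as Claude 3.5 Sonnet (36.0%) and GPT-4o (31.5%) under the same evaluation protocol.
Nearly doubles the previous best open-weight result (Go-Browse, 21.7%).
Transfers capabilities to unseen environments, with an 18.2 percentage point gain on WorkArena.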
Abstract
Frontier LLMs can navigate complex websites, but their cost and reliance on third-party APIs make local deployment impractical. We introduce Agent-as-Annotators, a framework that structures synthetic trajectory generation for web agents by analogy to human annotation roles, replacing the Task Designer, Annotator, and Supervisor with modular LLM components. Using Gemini 3 Pro as teacher, we generate 3,000 trajectories across six web environments and fine-tune a 9B-parameter student with pure supervised learning on the 2,322 that pass quality filtering. The resulting model achieves 41.5% on WebArena, surpassing closed-source models such as Claude 3.5 Sonnet (36.0%) and GPT-4o (31.5%) under the same evaluation protocol, and nearly doubling the previous best open-weight result (Go-Browse, 21.7%). Capabilities transfer to unseen environments, with an 18.2 percentage point gain on WorkArena…
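The three-role pipeline described in the abstract can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the `Trajectory` type, the function signatures, the stub components, and the environment names `"shop"` and `"forum"` are all hypothetical stand-ins for the LLM-backed Task Designer, Annotator, and Supervisor. The last helper simply computes the share of trajectories that survive filtering, using the 2,322 and 3,000 figures quoted above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Trajectory:
    """Hypothetical record of one attempted task (not the paper's schema)."""
    task: str
    steps: List[str] = field(default_factory=list)


# Hypothetical role signatures; in the paper each role is an LLM component.
TaskDesigner = Callable[[str], List[str]]   # environment name -> task descriptions
Annotator = Callable[[str], Trajectory]     # task description -> attempted trajectory
Supervisor = Callable[[Trajectory], bool]   # trajectory -> keep for training?


def generate_dataset(
    environments: List[str],
    design_tasks: TaskDesigner,
    annotate: Annotator,
    approve: Supervisor,
) -> Tuple[List[Trajectory], int]:
    """Run designer -> annotator -> supervisor and return (kept, total generated)."""
    kept: List[Trajectory] = []
    total = 0
    for env in environments:
        for task in design_tasks(env):
            trajectory = annotate(task)
            total += 1
            if approve(trajectory):
                kept.append(trajectory)
    return kept, total


def pass_rate(kept: int, total: int) -> float:
    """Fraction of generated trajectories that pass quality filtering."""
    return kept / total


# Stub components, for illustration only.
def stub_designer(env: str) -> List[str]:
    return [f"{env}: task {i}" for i in range(3)]


def stub_annotator(task: str) -> Trajectory:
    # Pretend every task ending in "2" fails and yields no steps.
    steps = [] if task.endswith("2") else ["click", "type"]
    return Trajectory(task=task, steps=steps)


def stub_supervisor(trajectory: Trajectory) -> bool:
    # Toy filter: keep only trajectories with at least one step.
    return len(trajectory.steps) > 0


kept, total = generate_dataset(
    ["shop", "forum"], stub_designer, stub_annotator, stub_supervisor
)
print(f"toy run: kept {len(kept)} of {total}")
print(f"reported filtering pass rate: {pass_rate(2322, 3000):.1%}")
```

Under these assumptions, only the trajectories the supervisor approves would be passed on to supervised fine-tuning of the student; how the actual filter decides is not specified in the text above.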
