WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces
Sicheng Fan, Rui Wan, Yifei Leng, Gaoning Liang, Li Ling, Yanyi Shang, Dehan Kong

TL;DR
WebChain is a human-annotated dataset of real-world web interaction trajectories for web agent research. It aligns visual, structural, and action data at each step, and models trained on it reach state-of-the-art performance on WebChainBench and other public GUI benchmarks.
Contribution
The paper introduces WebChain, which the authors describe as the largest open-source dataset of human-annotated trajectories on real-world websites, and a Dual Mid-Training recipe that decouples spatial grounding from planning.
Findings
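The Dual Mid-Training recipe achieves state-of-the-art results on WebChainBench and other public GUI benchmarks.
A scalable collection pipeline covers complex, high-value tasks that synthetic methods often miss.
The dataset's aligned visual, structural, and action data gives rich, multi-modal supervision for training and evaluating web agents.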
Abstract
We introduce WebChain, the largest open-source dataset of human-annotated trajectories on real-world websites, designed to accelerate reproducible research in web agents. It contains 31,725 trajectories and 318k steps, featuring a core Triple Alignment of visual, structural, and action data to provide rich, multi-modal supervision. The data is collected via a scalable pipeline that ensures coverage of complex, high-value tasks often missed by synthetic methods. Leveraging this dataset, we propose a Dual Mid-Training recipe that decouples spatial grounding from planning, achieving state-of-the-art performance on our proposed WebChainBench and other public GUI benchmarks. Our work provides the data and insights necessary to build and rigorously evaluate the next generation of scalable web agents.
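To make the Triple Alignment idea concrete, here is a minimal sketch of how one trajectory step might be represented, with one field per modality. The class names, field names, and helper functions are hypothetical illustrations and are not taken from the released dataset's schema. The step count of 318,000 is the abstract's rounded "318k" figure, so the resulting average is approximate.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One interaction step (hypothetical schema, not the released format)."""
    screenshot_path: str   # visual modality: path to a page screenshot
    dom_snapshot: str      # structural modality: serialized page structure
    action_type: str       # action modality: e.g. "click" or "type"
    target: str = ""       # element the action applies to (assumed field)
    value: str = ""        # text entered, if any (assumed field)


@dataclass
class Trajectory:
    """A task and its ordered steps (hypothetical schema)."""
    task: str
    website: str
    steps: List[Step] = field(default_factory=list)


def is_triple_aligned(step: Step) -> bool:
    """True when the step carries all three modalities."""
    return bool(step.screenshot_path and step.dom_snapshot and step.action_type)


def mean_steps(total_steps: int, total_trajectories: int) -> float:
    """Average number of steps per trajectory."""
    return total_steps / total_trajectories


# Figures from the abstract: 31,725 trajectories and roughly 318k steps.
avg = mean_steps(318_000, 31_725)
print(f"about {avg:.1f} steps per trajectory")
```

By these figures a trajectory averages about ten steps, which fits the abstract's emphasis on complex, multi-step tasks.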
