Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Awadallah

TL;DR
This paper introduces Explorer, a scalable approach for synthesizing a large, diverse dataset of web trajectories, which enables training multimodal web agents that outperform previous models on several benchmarks.
Contribution
The paper presents a scalable recipe for synthesizing the largest and most diverse trajectory-level web dataset to date, which supports the training and evaluation of multimodal web agents.
Findings
Explorer achieves strong performance on multiple benchmarks.
Data scaling significantly improves web agent capabilities.
Cost-effective dataset synthesis enables broader community access.
Abstract
Recent success in large multimodal models (LMMs) has sparked promising applications of agents capable of autonomously completing complex web tasks. While open-source LMM agents have made significant advances in offline evaluation benchmarks, their performance still falls substantially short of human-level capabilities in more realistic online settings. A key bottleneck is the lack of diverse and large-scale trajectory-level datasets across various domains, which are expensive to collect. In this paper, we address this challenge by developing a scalable recipe to synthesize the largest and most diverse trajectory-level dataset to date, containing over 94K successful multimodal web trajectories, spanning 49K unique URLs, 720K screenshots, and 33M web elements. In particular, we leverage extensive web exploration and refinement to obtain diverse task intents. The average cost is 28 cents…
Taxonomy
Topics: Speech and dialogue systems · Semantic Web and Ontologies · Natural Language Processing Techniques
