WebSTAR: Scalable Data Synthesis for Computer Use Agents with Step-Level Filtering
Yifei He, Pranit Chawla, Yaser Souri, Subhojit Som, Xia Song

TL;DR
This paper introduces a scalable data synthesis pipeline with step-level filtering to generate high-quality training data for computer use agents, significantly improving their performance and robustness.
Contribution
The paper presents a novel step-level filtering method for synthesizing reliable training data from noisy model rollouts, enabling scalable training of computer use agents.
Findings
The WebSTAR dataset comprises 13.3K trajectories and 267K steps.
A 7B model trained on WebSTAR surpasses state-of-the-art open-source models by over 15%.
WebSCORE and StepRM provide efficient, high-quality step-level grading and reward modeling.
Abstract
Computer use agents (CUAs) can operate real-world digital interfaces but remain difficult to train due to the high cost of graphical user interface (GUI) interaction and the scarcity of high-quality trajectory data. Existing datasets rely on human demonstrations, limiting scalability. A natural alternative is to synthesize data from strong CUAs, yet their rollouts are highly noisy, with incorrect or suboptimal actions making up a large proportion of the steps, which renders naive imitation ineffective. To tackle this challenge, we introduce a scalable data synthesis pipeline that transforms noisy rollouts into reliable supervision without human annotation. The core idea is step-level filtering, which evaluates actions individually to retain only correct steps, complemented by reasoning augmentation for improved planning. Using this pipeline, we construct WebSTAR, a dataset of 13.3K…
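The step-level filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Step` type, the `filter_steps` function, the `threshold` parameter, and the `toy_grade` rule are all hypothetical names introduced here for illustration, and the grader is a stand-in for whatever step-level judge the pipeline actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One step of a rollout (hypothetical structure, for illustration only)."""
    observation: str
    action: str


def filter_steps(
    trajectory: List[Step],
    grade: Callable[[List[Step], Step], float],
    threshold: float = 0.5,
) -> List[Step]:
    """Grade each step individually and keep those scoring at least `threshold`.

    The grader sees the full preceding history, including steps that were
    themselves filtered out, so each judgment is made in the original context.
    """
    kept = []
    for i, step in enumerate(trajectory):
        history = trajectory[:i]
        if grade(history, step) >= threshold:
            kept.append(step)
    return kept


def toy_grade(history: List[Step], step: Step) -> float:
    """Toy stand-in grader (assumption): treat actions tagged 'misclick' as wrong."""
    return 0.0 if step.action.startswith("misclick") else 1.0


rollout = [
    Step("home page", "click search box"),
    Step("home page", "misclick banner ad"),
    Step("search box focused", "type query"),
]
kept = filter_steps(rollout, toy_grade)
print([s.action for s in kept])
```

The contrast with trajectory-level filtering is that a rollout containing one bad action is not discarded wholesale; only the offending step is dropped from the training data.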
Taxonomy
Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Artificial Intelligence in Games
