HATS: Hardness-Aware Trajectory Synthesis for GUI Agents
Rui Shao, Ruize Gao, Bin Xie, Yixing Li, Kaiwen Zhou, Shuai Wang, Weili Guan, Gongwei Chen

TL;DR
HATS is a trajectory synthesis framework for GUI agents that targets semantically ambiguous actions, using iterative exploration and refinement to improve generalization and the alignment between instructions and execution.
Contribution
The paper presents a novel hardness-aware synthesis method that addresses semantic ambiguity in GUI actions, enhancing data quality for training more capable GUI agents.
Findings
Agents trained with HATS outperform baselines on benchmark environments.
HATS effectively identifies and mitigates semantic ambiguities in trajectory data.
The framework improves instruction-execution alignment in GUI tasks.
Abstract
Graphical user interface (GUI) agents powered by large vision-language models (VLMs) have shown remarkable potential in automating digital tasks, highlighting the need for high-quality trajectory data to support effective agent training. Yet existing trajectory synthesis pipelines often yield agents that fail to generalize beyond simple interactions. We identify this limitation as stemming from the neglect of semantically ambiguous actions, whose meanings are context-dependent, sequentially dependent, or visually ambiguous. Such actions are crucial for real-world robustness but are under-represented and poorly processed in current datasets, leading to semantic misalignment between task instructions and execution. To address these issues, we propose HATS, a Hardness-Aware Trajectory Synthesis framework designed to mitigate the impact of semantic ambiguity. We define hardness as the…
Taxonomy
Topics: Multimodal Machine Learning Applications · Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms
