HalluClear: Diagnosing, Evaluating and Mitigating Hallucinations in GUI Agents
Chao Jin, Wenkui Yang, Hao Sun, Yuqi Liao, Qianyi Jiang, Kai Zhou, Jie Cao, Ran He, Huaibo Huang

TL;DR
HalluClear is a comprehensive toolkit designed to diagnose, evaluate, and reduce hallucinations in GUI agents, improving their reliability and grounding with minimal additional training.
Contribution
It introduces a GUI-specific hallucination taxonomy, a calibrated evaluation workflow, and a lightweight mitigation scheme for robust GUI agent performance.
Findings
Post-training on 9K samples reduces hallucinations significantly.
HalluClear improves grounding and action fidelity in GUI agents.
The suite offers a compute-efficient pathway for robust GUI automation.
Abstract
While progress in GUI agents has been largely driven by industrial-scale training, ungrounded hallucinations often trigger cascading failures in real-world deployments. Unlike general VLM domains, the GUI agent field lacks a hallucination-focused suite for fine-grained diagnosis, reliable evaluation, and targeted mitigation. To bridge this gap, we introduce HalluClear, a comprehensive suite for hallucination mitigation in GUI agents as a complement to computation-intensive scaling. HalluClear comprises: (1) a GUI-specific hallucination taxonomy derived from empirical failure analysis; (2) a calibrated three-stage evaluation workflow which enhances VLM-as-a-judge reliability via expert-annotated benchmarking and ensemble credibility estimation; and (3) a mitigation scheme based on closed-loop structured reasoning, enabling lightweight continual post-training with cold-start initialization…
