Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation

Yuyang Wanyan; Xi Zhang; Haiyang Xu; Haowei Liu; Junyang Wang; Jiabo Ye; Yutong Kou; Ming Yan; Fei Huang; Xiaoshan Yang; Weiming Dong; Changsheng Xu

arXiv:2506.04614 · cs.AI · November 18, 2025

TL;DR

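This paper introduces GUI-Critic-R1, a pre-operative critic model for GUI automation that judges each proposed action before it is executed, so errors can be caught before they take effect. It is trained with a novel suggestion reward on data from a new collection pipeline, and it improves critic accuracy as well as the success rate and efficiency of GUI automation.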

Contribution

The paper presents a novel pre-operative critic mechanism with a suggestion-aware reward and a new data pipeline, enhancing GUI automation decision-making accuracy and reliability.

Findings

01. GUI-Critic-R1 outperforms existing MLLMs in critic accuracy.

02. The model improves success rates on GUI automation benchmarks.

03. The model demonstrates enhanced operational efficiency in dynamic evaluations.

Abstract

In recent years, Multimodal Large Language Models (MLLMs) have been extensively utilized for multimodal reasoning tasks, including Graphical User Interface (GUI) automation. Unlike general offline multimodal tasks, GUI automation is executed in online interactive environments, necessitating step-by-step decision-making based on the real-time status of the environment. This task has a lower tolerance for decision-making errors at each step, as any mistake may cumulatively disrupt the process and potentially lead to irreversible outcomes such as deletions or payments. To address these issues, we introduce a pre-operative critic mechanism that provides effective feedback prior to the actual execution, by reasoning about the potential outcome and correctness of actions. Specifically, we propose a Suggestion-aware Gradient Relative Policy Optimization (S-GRPO) strategy to construct our…
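The pre-operative critic idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Critique` type, the `run_step` function, the retry limit, and the stub agent and critic in `demo` are all hypothetical names chosen for illustration. The sketch only shows the control flow in which a proposed action is checked before it reaches the environment, and a rejected action is re-proposed using the critic's suggestion.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Critique:
    """Hypothetical critic output: a verdict plus an optional suggestion."""
    is_correct: bool
    suggestion: Optional[str] = None


def run_step(
    propose: Callable[[str, List[str]], str],
    critic: Callable[[str, str], Critique],
    execute: Callable[[str], None],
    observation: str,
    max_retries: int = 2,
) -> str:
    """Propose an action, let the critic check it, and execute exactly once.

    The critic sees the action before execution, so a rejected action
    never touches the environment.
    """
    feedback: List[str] = []
    action = propose(observation, feedback)
    for _ in range(max_retries):
        verdict = critic(observation, action)
        if verdict.is_correct:
            break
        if verdict.suggestion:
            feedback.append(verdict.suggestion)
        # Re-propose with the critic's suggestions as extra context.
        action = propose(observation, feedback)
    # Assumption: once retries are exhausted, the latest proposal is executed
    # without a further check; a stricter policy could abort instead.
    execute(action)
    return action


def demo():
    """Stub agent and critic standing in for real models (illustrative only)."""
    executed: List[str] = []

    def propose(obs: str, feedback: List[str]) -> str:
        return "tap('Cancel')" if feedback else "tap('Delete')"

    def critic(obs: str, action: str) -> Critique:
        if "Delete" in action:
            return Critique(False, "The goal is to dismiss the dialog; tap Cancel.")
        return Critique(True)

    action = run_step(propose, critic, executed.append, "dialog: Delete file?")
    return action, executed
```

In the stub scenario, the first proposal (`tap('Delete')`) is rejected before it runs, and only the revised action is executed. Irreversible operations such as deletions or payments are the cases where checking before execution matters most.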

Code & Models

Models

BonnieOne/GUI-Critic-R1 (Hugging Face model · 16 downloads · 1 like)

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems