SafeGround: Know When to Trust GUI Grounding Models via Uncertainty Calibration

Qingni Wang; Yue Fan; Xin Eric Wang

arXiv:2602.02419·cs.AI·February 4, 2026

TL;DR

SafeGround introduces an uncertainty calibration framework for GUI grounding models, enabling risk-aware predictions with controlled false discovery rates, leading to improved system accuracy and reliability.

Contribution

The paper presents an uncertainty-aware calibration method for GUI grounding models that improves prediction reliability and supports risk control.

Findings

01. Outperforms existing baselines in distinguishing correct from incorrect predictions.

02. Calibrated thresholds enable rigorous risk control and accuracy improvements.

03. Improves system-level accuracy by up to 5.38 percentage points.

Abstract

Graphical User Interface (GUI) grounding aims to translate natural language instructions into executable screen coordinates, enabling automated GUI interaction. However, incorrect grounding can result in costly, hard-to-reverse actions (e.g., erroneous payment approvals), raising concerns about model reliability. In this paper, we introduce SafeGround, an uncertainty-aware framework for GUI grounding models that enables risk-aware predictions through a calibration step performed before test time. SafeGround leverages a distribution-aware uncertainty quantification method to capture the spatial dispersion of stochastic samples drawn from the outputs of any given model. Then, through the calibration process, SafeGround derives a test-time decision threshold with statistically guaranteed false discovery rate (FDR) control. We apply SafeGround to multiple GUI grounding models for the challenging ScreenSpot-Pro…
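
To make the two steps in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the dispersion score (mean distance of sampled click points from their centroid), the Hoeffding-style upper bound with a union bound over candidate thresholds, and the names `dispersion`, `calibrate_threshold`, and `should_act`. The paper's actual uncertainty measure and calibration procedure may differ.

```python
import math


def dispersion(samples):
    """Uncertainty score (assumed): mean Euclidean distance of the sampled
    click points from their centroid. Tightly clustered samples score low."""
    cx = sum(x for x, _ in samples) / len(samples)
    cy = sum(y for _, y in samples) / len(samples)
    return sum(math.hypot(x - cx, y - cy) for x, y in samples) / len(samples)


def calibrate_threshold(uncertainties, correct, alpha, delta):
    """Pick the largest threshold t such that, among calibration predictions
    with uncertainty <= t, an upper confidence bound on the error rate is at
    most alpha. Returns None if no threshold qualifies.

    The bound is Hoeffding's inequality with delta split evenly across all
    candidate thresholds (union bound). This is a simplification chosen for
    the sketch, not the procedure from the paper.
    """
    candidates = sorted(set(uncertainties))
    if not candidates:
        return None
    log_term = math.log(len(candidates) / delta)
    best = None
    for t in candidates:
        accepted = [ok for u, ok in zip(uncertainties, correct) if u <= t]
        n = len(accepted)
        errors = n - sum(accepted)  # accepted but incorrect predictions
        bound = errors / n + math.sqrt(log_term / (2.0 * n))
        if bound <= alpha:
            best = t
    return best


def should_act(samples, threshold):
    """Execute the predicted click only if the samples are concentrated enough;
    otherwise abstain (for example, defer to a human or a stronger model)."""
    return threshold is not None and dispersion(samples) <= threshold
```

At test time, a system built this way would draw several stochastic predictions for one instruction, score them with `dispersion`, and act only when `should_act` returns `True`. Lowering `alpha` makes the system abstain more often in exchange for fewer wrong actions among those it does take.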

Taxonomy

Topics: Interactive and Immersive Displays · Adversarial Robustness in Machine Learning · Security and Verification in Computing