GUIGuard-Bench: Toward a General Evaluation for Privacy-Preserving GUI Agents

Yanxi Wang; Zhiling Zhang; Wenbo Zhou; Weiming Zhang; Jie Zhang; Qiannan Zhu; Yu Shi; Shuxin Zheng; and Jiyan He

arXiv:2601.18842·cs.CR·May 14, 2026

TL;DR

GUIGuard-Bench is a benchmark of annotated GUI-agent trajectories for evaluating privacy-preserving strategies in GUI agents. It highlights where current models succeed and where they fall short in privacy recognition and task utility.

Contribution

It provides the first trajectory-based privacy benchmark for GUI agents, supporting multiple evaluation tasks and revealing key challenges in privacy detection and protection.

Findings

01

Models can detect the presence of private information but struggle to localize and categorize it.

02

Closed-source models like Claude Sonnet 4.6 maintain task semantics after privacy protection.

03

Privacy recognition remains a critical bottleneck for practical GUI agents.

Abstract

As GUI agents increasingly rely on screenshots to perceive and operate digital environments, they may inadvertently expose sensitive information such as identities, accounts, locations, and behavioral traces. While existing benchmarks primarily focus on task completion, grounding, or defenses against third-party attacks, current visual privacy datasets remain largely restricted to static natural images, limiting their ability to capture the contextual dependence and task relevance of privacy risks in GUI task trajectories. To bridge this gap, we introduce GUIGuard-Bench, a first-step benchmark for studying privacy-preserving GUI agents in trajectory-based GUI workflows. GUIGuard-Bench contains 241 real GUI-agent trajectories with 4,080 screenshots across Android and PC environments. Each screenshot is annotated at the region level with privacy bounding boxes, semantic privacy…
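To make the distinction between detecting, localizing, and categorizing private regions concrete, here is a minimal Python sketch of how region-level predictions might be compared with bounding-box annotations on one screenshot. Everything in it is an assumption for illustration: the `PrivacyRegion` fields, the category strings, the IoU threshold of 0.5, and the scoring scheme are hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivacyRegion:
    """Hypothetical annotation record: a pixel box plus a category label."""
    x1: float
    y1: float
    x2: float
    y2: float
    category: str  # illustrative label such as "account"; not the paper's taxonomy


def iou(a: PrivacyRegion, b: PrivacyRegion) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def score_screenshot(gold, pred, iou_threshold=0.5):
    """Score one screenshot at three levels (assumed scheme, not the paper's).

    presence_ok: the model agrees on whether any private region exists.
    localized:   gold regions whose best-overlapping prediction meets the threshold.
    categorized: localized regions whose predicted category also matches.
    """
    presence_ok = bool(gold) == bool(pred)
    localized = 0
    categorized = 0
    for g in gold:
        best = max(pred, key=lambda p: iou(g, p), default=None)
        if best is not None and iou(g, best) >= iou_threshold:
            localized += 1
            if best.category == g.category:
                categorized += 1
    return {
        "presence_ok": presence_ok,
        "localized": localized,
        "categorized": categorized,
        "gold_regions": len(gold),
    }


if __name__ == "__main__":
    gold = [PrivacyRegion(0, 0, 10, 10, "account")]
    pred = [PrivacyRegion(100, 100, 110, 110, "account")]
    # Presence is detected, but the predicted box misses the annotated region.
    print(score_screenshot(gold, pred))
```

Under this assumed scheme, a model can score well on presence while scoring zero on localization, which is the kind of gap the findings describe.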

Code & Models

Repositories

https://futuresis.github.io/GUIGuard-page