TransBench: Breaking Barriers for Transferable Graphical User Interface Agents in Dynamic Digital Environments

Yuheng Lu; Qian Yu; Hongru Wang; Zeming Liu; Wei Su; Yanping Liu; Yuhang Guo; Maocheng Liang; Yunhong Wang; Haifeng Wang

arXiv:2505.17629·cs.HC·May 28, 2025

TL;DR

This paper introduces TransBench, a comprehensive benchmark for evaluating and improving the transferability of GUI agents across different platforms, versions, and applications, addressing key challenges in dynamic digital environments.

Contribution

We present TransBench, the first benchmark to systematically assess and enhance GUI agent transferability across multiple dimensions and diverse app categories.

Findings

01 Significant improvements in grounding accuracy with our methods.

02 TransBench effectively evaluates cross-version, cross-platform, and cross-application transferability.

03 Our code and data are publicly available for further research.

Abstract

Graphical User Interface (GUI) agents, which autonomously operate on digital interfaces through natural language instructions, hold transformative potential for accessibility, automation, and user experience. A critical aspect of their functionality is grounding: the ability to map linguistic intents to visual and structural interface elements. However, existing GUI agents often struggle to adapt to the dynamic and interconnected nature of real-world digital environments, where tasks frequently span multiple platforms and applications and are also affected by version updates. To address this, we introduce TransBench, the first benchmark designed to systematically evaluate and enhance the transferability of GUI agents across three key dimensions: cross-version transferability (adapting to version updates), cross-platform transferability (generalizing across platforms like iOS,…

Code & Models

Repositories

buaa-irip-llm/transbench
Official
