ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

Fei Tang; Zhiqiong Lu; Boxuan Zhang; Weiming Lu; Jun Xiao; Yueting Zhuang; Yongliang Shen

arXiv:2604.11784 · cs.LG · April 14, 2026

TL;DR

ClawGUI is an open-source framework that unifies training, evaluation, and deployment of GUI agents across virtual and real devices, improving reproducibility and real-world applicability.

Contribution

ClawGUI introduces a full-stack infrastructure for GUI agents, covering RL training, standardized evaluation, and deployment on multiple platforms.

Findings

01. Achieves 95.8% reproduction accuracy across benchmarks.

02. ClawGUI-2B reaches a 17.1% success rate on MobileWorld, outperforming the baseline.

03. Supports deployment on Android, HarmonyOS, and iOS.

Abstract

GUI agents drive applications through their visual interfaces instead of programmatic APIs, interacting with arbitrary software via taps, swipes, and keystrokes, reaching a long tail of applications that CLI-based agents cannot. Yet progress in this area is bottlenecked less by modeling capacity than by the absence of a coherent full-stack infrastructure: online RL training suffers from environment instability and closed pipelines, evaluation protocols drift silently across works, and trained agents rarely reach real users on real devices. We present ClawGUI, an open-source framework addressing these three gaps within a single harness. ClawGUI-RL provides the first open-source GUI agent RL infrastructure with validated support for both parallel virtual environments and real physical devices, integrating GiGPO with a Process Reward Model for dense step-level…

Code & Models

Repositories

zju-real/ClawGUI (GitHub)
