UI-Evol: Automatic Knowledge Evolving for Computer Use Agents

Ziyun Zhang; Xinyi Liu; Xiaoyi Zhang; Jun Wang; Gang Chen; Yan Lu

arXiv:2505.21964 · cs.HC · November 4, 2025

TL;DR

UI-Evol is a modular system that enhances computer use agents by evolving GUI knowledge through interaction data and external references, significantly improving task success rates and reliability.

Contribution

It introduces a novel plug-and-play knowledge evolution module with retrace and critique stages, addressing the knowledge-execution gap in computer use agents.

Findings

01. UI-Evol significantly improves task performance on the OSWorld benchmark.

02. It reduces the standard deviation of agent behavior, making the agent more reliable.

03. These gains are demonstrated in experiments with the state-of-the-art Agent S2.

Abstract

External knowledge has played a crucial role in the recent development of computer use agents. We identify a critical knowledge-execution gap: retrieved knowledge often fails to translate into effective real-world task execution. Our analysis shows even 90% correct knowledge yields only 41% execution success rate. To bridge this gap, we propose UI-Evol, a plug-and-play module for autonomous GUI knowledge evolution. UI-Evol consists of two stages: a Retrace Stage that extracts faithful objective action sequences from actual agent-environment interactions, and a Critique Stage that refines existing knowledge by comparing these sequences against external references. We conduct comprehensive experiments on the OSWorld benchmark with the state-of-the-art Agent S2. Our results demonstrate that UI-Evol not only significantly boosts task performance but also addresses a previously overlooked…
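
The two-stage loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how either stage is realized, so the `Step` schema, the `changed_ui` flag, and the rule inside `toy_critic` are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One logged agent-environment interaction (hypothetical schema)."""
    action: str       # e.g. "click('File')"
    changed_ui: bool  # assumed signal that the action actually took effect


def retrace(trajectory: List[Step]) -> List[str]:
    """Retrace stage (sketch): recover the actions that really happened.

    Assumption: `changed_ui` stands in for whatever evidence a real
    system would use to decide which actions were objectively executed.
    """
    return [step.action for step in trajectory if step.changed_ui]


def toy_critic(knowledge: List[str], retraced: List[str],
               reference: List[str]) -> List[str]:
    """Critique stage (toy rule, an assumption of this sketch).

    Keeps retraced actions that the external reference also lists and
    falls back to the existing knowledge when nothing is confirmed.
    """
    confirmed = [action for action in retraced if action in reference]
    return confirmed or knowledge


Critic = Callable[[List[str], List[str], List[str]], List[str]]


def evolve(knowledge: List[str], trajectory: List[Step],
           reference: List[str], critic: Critic = toy_critic) -> List[str]:
    """Run one retrace-then-critique round and return refined knowledge."""
    retraced = retrace(trajectory)
    return critic(knowledge, retraced, reference)
```

The critic is passed in as a plain callable so that the toy rule can be swapped for a model-based comparison without changing the surrounding loop.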

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Personal Information Management and User Behavior

Methods: Retrace