CUA-Skill: Develop Skills for Computer Using Agent
Tianyi Chen, Yinheng Li, Michael Solodko, Sen Wang, Nan Jiang, Tingyuan Cui, Junheng Hao, Jongwoo Ko, Sara Abdali, Leon Xu, Suzhen Zheng, Hao Fan, Pashmina Cameron, Justin Wagle, Kazuhito Koishida

TL;DR
CUA-Skill introduces a structured, reusable skill library for computer-using agents, significantly enhancing their success rates and robustness in automating Windows tasks, and providing a scalable infrastructure for future development.
Contribution
The paper presents CUA-Skill, a large-scale, structured skill library for computer-using agents, enabling more reliable and scalable automation of Windows applications.
Findings
Achieves a 57.5% success rate on WindowsAgentArena, outperforming prior methods.
Substantially improves robustness and execution success in end-to-end benchmarks.
Provides a scalable infrastructure for future agent development.
Abstract
Computer-Using Agents (CUAs) aim to autonomously operate computer systems to complete real-world tasks. However, existing agentic systems remain difficult to scale and lag behind human performance. A key limitation is the absence of reusable and structured skill abstractions that capture how humans interact with graphical user interfaces and how to leverage these skills. We introduce CUA-Skill, a computer-using agentic skill base that encodes human computer-use knowledge as skills coupled with parameterized execution and composition graphs. CUA-Skill is a large-scale library of carefully engineered skills spanning common Windows applications, serving as a practical infrastructure and tool substrate for scalable, reliable agent development. Built upon this skill base, we construct CUA-Skill Agent, an end-to-end computer-using agent that supports dynamic skill retrieval, argument…
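The idea of skills "coupled with parameterized execution and composition graphs" can be illustrated with a small sketch. This is not the paper's implementation: the `Skill` and `SkillLibrary` classes, the action strings, and the word-overlap retrieval are all hypothetical stand-ins chosen to show the shape of the data structure under simple assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill: a named, parameterized sequence of GUI actions."""
    name: str
    description: str
    params: tuple[str, ...]
    steps: tuple[str, ...]  # action templates with {param} placeholders

    def instantiate(self, **args: str) -> list[str]:
        """Fill the action templates with concrete arguments."""
        missing = [p for p in self.params if p not in args]
        if missing:
            raise ValueError(f"missing arguments for {self.name}: {missing}")
        return [step.format(**args) for step in self.steps]


@dataclass
class SkillLibrary:
    """Hypothetical skill base with a composition graph over skill names."""
    skills: dict[str, Skill] = field(default_factory=dict)
    # Composition graph: skill name -> names of skills that may follow it.
    next_skills: dict[str, set[str]] = field(default_factory=dict)

    def add(self, skill: Skill, follows: tuple[str, ...] = ()) -> None:
        self.skills[skill.name] = skill
        self.next_skills.setdefault(skill.name, set())
        for prev in follows:
            self.next_skills.setdefault(prev, set()).add(skill.name)

    def retrieve(self, query: str) -> Skill:
        """Naive retrieval (an assumption): most words shared with the query."""
        words = set(query.lower().split())
        return max(
            self.skills.values(),
            key=lambda s: len(words & set(s.description.lower().split())),
        )


# Illustrative skills; the action strings are made up for this example.
lib = SkillLibrary()
lib.add(Skill(
    name="open_app",
    description="launch an application from the start menu",
    params=("app",),
    steps=("open start menu", "type {app}", "press Enter"),
))
lib.add(Skill(
    name="save_as",
    description="save the current document under a new file name",
    params=("filename",),
    steps=("open Save As dialog", "type {filename}", "press Enter"),
), follows=("open_app",))

skill = lib.retrieve("save document as report.docx")
plan = skill.instantiate(filename="report.docx")
```

Here retrieval selects a skill from the task description, and argument instantiation turns its templates into a concrete action list. A real agent would presumably use a learned retriever and grounded GUI actions rather than string matching and text templates.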
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Topic Modeling · Multimodal Machine Learning Applications
