OpenCUA: Open Foundations for Computer-Use Agents
Xinyuan Wang, Bowen Wang, Dunjie Lu, Junlin Yang, Tianbao Xie, Junli Wang, Jiaqi Deng, Xiaole Guo, Yiheng Xu, Chen Henry Wu, Zhennan Shen, Zhuokai Li, Ryan Li, Xiaochuan Li, Junda Chen, Boyuan Zheng, Peihang Li, Fangyu Lei, Ruisheng Cao, Yeqiao Fu, Dongchan Shin, Martin Shin

TL;DR
OpenCUA introduces an open-source framework with datasets, tools, and models to advance research on computer-use agents capable of automating diverse digital tasks across multiple operating systems.
Contribution
It provides an annotation infrastructure, AgentNet (the first large-scale computer-use task dataset), and scalable models, enabling open research on and development of CUAs.
Findings
OpenCUA-72B achieves 45.0% success rate on OSWorld-Verified.
The framework generalizes well across domains.
Models benefit from increased test-time computation.
Abstract
Vision-language models have demonstrated impressive capabilities as computer-use agents (CUAs) capable of automating diverse computer tasks. As their commercial potential grows, critical details of the most capable CUA systems remain closed. As these agents will increasingly mediate digital interactions and execute consequential decisions on our behalf, the research community needs access to open CUA frameworks to study their capabilities, limitations, and risks. To bridge this gap, we propose OpenCUA, a comprehensive open-source framework for scaling CUA data and foundation models. Our framework consists of: (1) an annotation infrastructure that seamlessly captures human computer-use demonstrations; (2) AgentNet, the first large-scale computer-use task dataset spanning 3 operating systems and 200+ applications and websites; (3) a scalable pipeline that transforms demonstrations into…
Code & Models
- 🤗 xlangai/OpenCUA-7B · model · 60k dl · ♡ 29
- 🤗 xlangai/OpenCUA-32B · model · 107 dl · ♡ 27
- 🤗 reece124/OpenCUA-32B-converted · model · 1 dl
- 🤗 reece124/OpenCUA-7B-converted · model · 3 dl
- 🤗 zhiyuanhucs/OpenCUA-7B-vllm · model · 3 dl
- 🤗 zhiyuanhucs/OpenCUA-7B-vllm-new · model · 1 dl
- 🤗 zhiyuanhucs/OpenCUA-7B-vllm-0828 · model · 1 dl
- 🤗 zhiyuanhucs/OpenCUA-32B-vllm · model · 7 dl · ♡ 1
- 🤗 manpk-ai/OpenCUA-7B · model · 11 dl
- 🤗 manpk-ai/OpenCUA-7B-Copy · model · 13 dl
