OpenComputer: Verifiable Software Worlds for Computer-Use Agents

Jinbiao Wei; Qianran Ma; Yilun Zhao; Xiao Zhou; Kangqi Ni; Guo Gan; Arman Cohan

arXiv:2605.19769 · cs.AI · May 20, 2026

TL;DR

OpenComputer is a framework for building verifiable, structured software environments for computer-use agents. It combines app-specific state verifiers, a self-improving verification layer, task synthesis, and an evaluation harness to make agent evaluation more reliable and auditable.

Contribution

The paper introduces a verifier-grounded system for constructing and evaluating verifiable software worlds across 33 desktop applications.

Findings

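01. OpenComputer's hard-coded verifiers align more closely with human judgment than LLM-based evaluation.

02. OpenComputer covers 33 desktop applications and 1,000 finalized tasks spanning browsers, office tools, creative software, development environments, file managers, and communication applications.

03. Open-source models show significant performance drops relative to their OSWorld-Verified scores.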

Abstract

We present OpenComputer, a verifier-grounded framework for constructing verifiable software worlds for computer-use agents. OpenComputer integrates four components: (1) app-specific state verifiers that expose structured inspection endpoints over real applications, (2) a self-evolving verification layer that improves verifier reliability using execution-grounded feedback, (3) a task-generation pipeline that synthesizes realistic and machine-checkable desktop tasks, and (4) an evaluation harness that records full trajectories and computes auditable partial-credit rewards. In its current form, OpenComputer covers 33 desktop applications and 1,000 finalized tasks spanning browsers, office tools, creative software, development environments, file managers, and communication applications. Experiments show that OpenComputer's hard-coded verifiers align more closely with human adjudication than…
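The abstract mentions auditable partial-credit rewards computed from application state. The following is a minimal sketch of how such a reward might be computed, not the authors' implementation: the `Check` class, the `partial_credit` function, the weights, and the example state keys are all hypothetical names introduced here for illustration. It assumes a verifier has already exposed the application state as a plain dictionary.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Check:
    """One machine-checkable condition on the inspected state (hypothetical)."""
    name: str
    weight: float
    predicate: Callable[[Dict[str, Any]], bool]


def partial_credit(
    state: Dict[str, Any], checks: List[Check]
) -> Tuple[float, Dict[str, bool]]:
    """Return a reward in [0, 1] plus a per-check record for auditing."""
    total = sum(c.weight for c in checks)
    if total <= 0:
        raise ValueError("checks must have positive total weight")
    # Keep every individual outcome so the score can be audited later.
    results = {c.name: bool(c.predicate(state)) for c in checks}
    earned = sum(c.weight for c in checks if results[c.name])
    return earned / total, results


# Hypothetical state, as a spreadsheet verifier might report it.
example_state = {"file_exists": True, "cell_A1": "Total", "font_bold": False}

example_checks = [
    Check("file saved", 1.0, lambda s: s.get("file_exists") is True),
    Check("header text", 1.0, lambda s: s.get("cell_A1") == "Total"),
    Check("header bold", 2.0, lambda s: s.get("font_bold") is True),
]

reward, audit = partial_credit(example_state, example_checks)
```

Here two of the three checks pass, earning 2.0 of 4.0 weight, so `reward` is 0.5. Returning the per-check dictionary alongside the score is one way to keep the reward auditable: a reviewer can see which conditions failed rather than only the aggregate number.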

Code & Models

Repositories

echo0715/OpenComputer (GitHub)
