Gym-Anything: Turn any Software into an Agent Environment

Pranjal Aggarwal; Graham Neubig; Sean Welleck

arXiv:2604.06126·cs.LG·April 8, 2026

TL;DR

Gym-Anything is a framework that converts any software into an interactive environment for training and evaluating computer-use agents, enabling scalable, long-horizon tasks across diverse domains.

Contribution

It introduces a novel pipeline for automatic environment creation from software, producing a large, diverse benchmark dataset and improving agent performance through multi-agent auditing.

Findings

01. Created CUA-World with 10K+ tasks across multiple domains.

02. Developed a vision-language model that outperforms larger models on the benchmark.

03. Enhanced agent performance by applying auditing feedback at test time.

Abstract

Computer-use agents hold the promise of assisting in a wide range of digital economic activities. However, current research has largely focused on short-horizon tasks over a limited set of software with limited economic value, such as basic e-commerce and OS-configuration tasks. A key reason is that creating environments for complex software requires significant time and human effort, and therefore does not scale. To address this, we introduce Gym-Anything, a framework for converting any software into an interactive computer-use environment. We frame environment creation itself as a multi-agent task: a coding agent writes setup scripts, downloads real-world data, and configures the software, while producing evidence of correct setup. An independent audit agent then verifies evidence for the environment setup against a quality checklist. Using a taxonomy of economically valuable…
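
The create-then-audit loop described in the abstract can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the checklist items, the `SetupAttempt` and `AuditReport` types, and the stubbed `coding_agent` and `audit_agent` functions are hypothetical and are not taken from the Gym-Anything code. A real system would call language-model agents and run the generated scripts; here both agents are deterministic stubs so the control flow is easy to follow.

```python
from dataclasses import dataclass, field

# Hypothetical quality checklist; the paper's actual checklist is not shown in the abstract.
CHECKLIST = ["software_launches", "real_data_loaded", "config_applied"]


@dataclass
class SetupAttempt:
    """Output of the (stubbed) coding agent for one attempt."""
    script: str
    evidence: dict[str, bool]  # checklist item -> was evidence of correct setup produced?


@dataclass
class AuditReport:
    """Verdict of the (stubbed) audit agent."""
    passed: bool
    missing: list[str] = field(default_factory=list)


def coding_agent(software: str, feedback: list[str]) -> SetupAttempt:
    """Stub: the first attempt forgets to load data; feedback from the audit fixes it."""
    evidence = {
        "software_launches": True,
        "real_data_loaded": "real_data_loaded" in feedback,
        "config_applied": True,
    }
    return SetupAttempt(script=f"# setup script for {software}", evidence=evidence)


def audit_agent(attempt: SetupAttempt, checklist: list[str]) -> AuditReport:
    """Stub: independently check the evidence against every checklist item."""
    missing = [item for item in checklist if not attempt.evidence.get(item, False)]
    return AuditReport(passed=not missing, missing=missing)


def build_environment(
    software: str, checklist: list[str], max_rounds: int = 3
) -> tuple[SetupAttempt, int]:
    """Alternate between the coding agent and the audit agent until the audit passes."""
    feedback: list[str] = []
    for round_no in range(1, max_rounds + 1):
        attempt = coding_agent(software, feedback)
        report = audit_agent(attempt, checklist)
        if report.passed:
            return attempt, round_no
        feedback = report.missing  # send the unmet items back to the coding agent
    raise RuntimeError(f"audit still failing after {max_rounds} rounds: {feedback}")


if __name__ == "__main__":
    attempt, rounds = build_environment("example-app", CHECKLIST)
    print(f"accepted after {rounds} round(s)")
```

Keeping the auditor separate from the agent that wrote the setup script means a setup is accepted only when evidence exists for every checklist item, rather than on the coding agent's own claim of success.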
