Synthetic Computers at Scale for Long-Horizon Productivity Simulation

Tao Ge; Baolin Peng; Hao Cheng; Jianfeng Gao

arXiv:2604.28181 · cs.AI · May 1, 2026



TL;DR

This paper introduces a scalable method for creating realistic synthetic computer environments and uses them to simulate long-horizon productivity work, with the aim of supporting agent training and evaluation at scale.

Contribution

The authors present a scalable approach for generating realistic synthetic computer environments, with folder hierarchies and content-rich artifacts, on which long-horizon productivity simulations are run.

Findings

01. The authors created 1,000 synthetic computers, each with a detailed environment.

02. Each simulation spans more than 2,000 turns and requires 8+ hours of runtime.

03. The authors report significant improvements in agent performance on productivity tasks.

Abstract

Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and organized through directory structures and content-rich artifacts. To scale synthetic data creation for such productivity scenarios, we introduce Synthetic Computers at Scale, a scalable methodology for creating such environments with realistic folder hierarchies and content-rich artifacts (e.g., documents, spreadsheets, and presentations). Conditioned on each synthetic computer, we run long-horizon simulations: one agent creates productivity objectives that are specific to the computer's user and require multiple professional deliverables and about a month of human work; another agent then acts as that user and keeps working across the computer -- for example, navigating the filesystem for grounding, coordinating with simulated…
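To make the idea of a "synthetic computer" concrete, here is a minimal Python sketch, not the authors' pipeline. It assumes a hypothetical nested-dictionary spec (`SPEC`) in which dictionaries are folders and strings are file contents, writes that spec to a temporary directory, and then lists the resulting files the way an agent might when navigating the filesystem for grounding. All names (`SPEC`, `materialize`, `list_artifacts`, `build_demo`) and the example files are illustrative assumptions.

```python
import tempfile
from pathlib import Path

# Hypothetical spec (an assumption for illustration, not the paper's format):
# nested dicts are folders, strings are file contents.
SPEC = {
    "Documents": {
        "Q3_report.md": "# Q3 report draft\n",
        "Budget": {"forecast.csv": "month,amount\nJan,1000\n"},
    },
    "Presentations": {"kickoff_outline.txt": "1. Goals\n2. Timeline\n"},
}


def materialize(spec: dict, root: Path) -> None:
    """Recursively write the nested spec to disk under root."""
    for name, node in spec.items():
        path = root / name
        if isinstance(node, dict):
            path.mkdir(parents=True, exist_ok=True)
            materialize(node, path)
        else:
            path.write_text(node, encoding="utf-8")


def list_artifacts(root: Path) -> list[str]:
    """Return sorted relative paths of every file under root."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def build_demo() -> list[str]:
    """Create the synthetic computer in a temp dir and list its files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        materialize(SPEC, root)
        return list_artifacts(root)


if __name__ == "__main__":
    for rel in build_demo():
        print(rel)
```

A real environment would hold far richer artifacts (documents, spreadsheets, presentations) and a user profile; the sketch only shows how a declarative spec can be turned into a browsable directory tree.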


Code & Models

Datasets

microsoft/synthetic-computers-at-scale
dataset · 1.9k downloads
