Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

Deepak Nathani; Cheng Zhang; Chang Huan; Jiaming Shan; Yinfei Yang; Alkesh Patel; Zhe Gan; William Yang Wang; Michael Saxon; Xin Eric Wang

arXiv:2604.00842·cs.AI·April 2, 2026

TL;DR

Proactive Agent Research Environment (Pare) is a framework that simulates realistic user interactions in digital environments to evaluate proactive assistants. It addresses a limitation of existing approaches, which model apps as flat tool-calling APIs and so cannot capture stateful, sequential user interaction.

Contribution

The paper introduces Pare, a framework for stateful user simulation, and Pare-Bench, a benchmark of 143 diverse tasks for testing proactive agent capabilities.

Findings

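01. Pare enables active user simulation by modeling applications as finite state machines with stateful navigation and a state-dependent action space.

02. Pare-Bench includes 143 diverse tasks spanning communication, productivity, scheduling, and lifestyle apps.

03. The framework facilitates realistic evaluation of proactive assistants.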

Abstract

Proactive agents that anticipate user needs and autonomously execute tasks hold great promise as digital assistants, yet the lack of realistic user simulation frameworks hinders their development. Existing approaches model apps as flat tool-calling APIs, failing to capture the stateful and sequential nature of user interaction in digital environments and making realistic user simulation infeasible. We introduce Proactive Agent Research Environment (Pare), a framework for building and evaluating proactive agents in digital environments. Pare models applications as finite state machines with stateful navigation and state-dependent action space for the user simulator, enabling active user simulation. Building on this foundation, we present Pare-Bench, a benchmark of 143 diverse tasks spanning communication, productivity, scheduling, and lifestyle apps, designed to test context observation,…
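The abstract's central idea, an application modeled as a finite state machine whose available actions depend on the current state, can be illustrated with a minimal sketch. This is not Pare's API: the class `AppFSM`, the helper `make_messaging_app`, and the screen and action names are all hypothetical, chosen only to show how a state-dependent action space differs from a flat list of tool calls.

```python
from dataclasses import dataclass


@dataclass
class AppFSM:
    """Toy app modeled as a finite state machine (hypothetical, not Pare's API)."""

    # transitions[state][action] gives the state reached by taking that action
    transitions: dict[str, dict[str, str]]
    state: str

    def available_actions(self) -> list[str]:
        """Return only the actions offered by the current state (screen)."""
        return sorted(self.transitions[self.state])

    def step(self, action: str) -> str:
        """Apply an action, rejecting any that the current state does not offer."""
        if action not in self.transitions[self.state]:
            raise ValueError(f"{action!r} is not available in state {self.state!r}")
        self.state = self.transitions[self.state][action]
        return self.state


def make_messaging_app() -> AppFSM:
    """Build an illustrative three-screen messaging app (assumed layout)."""
    return AppFSM(
        transitions={
            "inbox": {"open_thread": "thread", "compose": "draft"},
            "thread": {"reply": "draft", "back": "inbox"},
            "draft": {"send": "inbox", "discard": "inbox"},
        },
        state="inbox",
    )


app = make_messaging_app()
print(app.available_actions())  # ['compose', 'open_thread']
app.step("open_thread")
print(app.available_actions())  # ['back', 'reply']
```

In a flat tool-calling model, `send` would be callable at any time. Here the simulated user must first navigate to the `draft` state, which is the kind of sequential constraint the abstract says flat APIs fail to capture.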

Code & Models

Repositories

deepakn97/pare (GitHub)
