ProAgentBench: Evaluating LLM Agents for Proactive Assistance with Real-World Data

Yuanbo Tang; Huaze Tang; Tingyu Cao; Lam Nguyen; Anping Zhang; Xinwen Cao; Chunkang Liu; Wenbo Ding; Yang Li

arXiv:2602.04482·cs.HC·February 11, 2026

TL;DR

ProAgentBench is a new benchmark with real user data and a hierarchical framework for evaluating proactive AI assistants, addressing limitations of synthetic datasets and isolated task focus in existing research.

Contribution

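The paper introduces a hierarchical task framework that splits proactive assistance into timing prediction and assist content generation, a real-world dataset of 28,000+ events from 500+ hours of user sessions, and evaluations of proactive AI agents in continuous workflows.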

Findings

01. Long-term memory improves prediction accuracy.

02. Real-world data outperforms synthetic data.

03. Historical context enhances proactive assistance.

Abstract

Proactive agents that anticipate user intentions without explicit prompts represent a significant evolution in human-AI interaction, promising to reduce cognitive load and streamline workflows. However, existing datasets suffer from two critical deficiencies: (1) reliance on LLM-synthesized data that fails to capture authentic human decision-making patterns, and (2) focus on isolated tasks rather than continuous workflows, missing the pre-assistance behavioral context essential for learning proactive intervention signals. To address these gaps, we introduce ProAgentBench, a rigorous benchmark for proactive agents in working scenarios. Our contributions include: (1) a hierarchical task framework that decomposes proactive assistance into timing prediction and assist content generation; (2) a privacy-compliant dataset with 28,000+ events from 500+ hours of real user sessions, preserving…

Taxonomy

Topics: Personal Information Management and User Behavior · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education