ProAgentBench: Evaluating LLM Agents for Proactive Assistance with Real-World Data
Yuanbo Tang, Huaze Tang, Tingyu Cao, Lam Nguyen, Anping Zhang, Xinwen Cao, Chunkang Liu, Wenbo Ding, Yang Li

TL;DR
ProAgentBench is a new benchmark with real user data and a hierarchical framework for evaluating proactive AI assistants, addressing limitations of synthetic datasets and isolated task focus in existing research.
Contribution
It introduces a hierarchical task framework, a large real-world dataset, and comprehensive evaluations for proactive AI agents in continuous workflows.
Findings
Long-term memory improves prediction accuracy.
Real-world data outperforms synthetic data.
Historical context enhances proactive assistance.
Abstract
Proactive agents that anticipate user intentions without explicit prompts represent a significant evolution in human-AI interaction, promising to reduce cognitive load and streamline workflows. However, existing datasets suffer from two critical deficiencies: (1) reliance on LLM-synthesized data that fails to capture authentic human decision-making patterns, and (2) focus on isolated tasks rather than continuous workflows, missing the pre-assistance behavioral context essential for learning proactive intervention signals. To address these gaps, we introduce ProAgentBench, a rigorous benchmark for proactive agents in working scenarios. Our contributions include: (1) a hierarchical task framework that decomposes proactive assistance into timing prediction and assist content generation; (2) a privacy-compliant dataset with 28,000+ events from 500+ hours of real user sessions, preserving…
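The two-level decomposition described in the abstract (first decide when to assist, then decide what to offer) can be illustrated with a small evaluation loop. This is only a sketch under stated assumptions, not the benchmark's actual code: the `Event` schema, the `predict_timing` and `generate_content` callables, and the token-overlap content metric are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass
class Event:
    """One step in a user session (hypothetical schema, not the benchmark's)."""
    description: str
    needs_assist: bool               # ground-truth timing label
    reference: Optional[str] = None  # ground-truth assist content, if any


def token_f1(pred: str, ref: str) -> float:
    """Placeholder content metric (an assumption): set-based token-overlap F1."""
    p, r = set(pred.lower().split()), set(ref.lower().split())
    common = len(p & r)
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(r)
    return 2 * precision * recall / (precision + recall)


def evaluate(
    events: Sequence[Event],
    predict_timing: Callable[[Sequence[Event], Event], bool],
    generate_content: Callable[[Sequence[Event], Event], str],
) -> Dict[str, float]:
    """Score timing on every event; score content only on correctly timed assists."""
    tp = fp = fn = 0
    content_scores = []
    for i, event in enumerate(events):
        history = events[:i]  # behavioral context preceding this event
        fire = predict_timing(history, event)
        if fire and event.needs_assist:
            tp += 1
            suggestion = generate_content(history, event)
            content_scores.append(token_f1(suggestion, event.reference or ""))
        elif fire:
            fp += 1
        elif event.needs_assist:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    timing_f1 = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    content = sum(content_scores) / len(content_scores) if content_scores else 0.0
    return {"timing_f1": timing_f1, "content_score": content}


# Toy session with trivial rule-based stand-ins for the two model stages.
session = [
    Event("open editor", False),
    Event("error in terminal", True, "search the error message"),
    Event("scroll docs", False),
]
result = evaluate(
    session,
    predict_timing=lambda history, e: "error" in e.description,
    generate_content=lambda history, e: "search the error message online",
)
print(result)
```

Passing the preceding events as `history` reflects the abstract's emphasis on pre-assistance behavioral context; a real agent would presumably condition both stages on it, whereas the toy lambdas here ignore it.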
Taxonomy
Topics: Personal Information Management and User Behavior · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
