# ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents

**Authors:** Tianjian Liu, Fanqi Wan, Jiajian Guo, Xiaojun Quan

arXiv: 2508.20973 · 2025-08-29

## TL;DR

ProactiveEval is a unified framework for evaluating the proactive dialogue abilities of large language models (LLMs). It covers target planning and dialogue guidance across multiple domains and supports automatic generation of evaluation data.

## Contribution

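The paper introduces ProactiveEval, a unified evaluation framework that decomposes proactive dialogue into two components, target planning and dialogue guidance. It establishes evaluation metrics across domains and supports automatic generation of diverse and challenging evaluation data.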
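To make the two-component decomposition concrete, here is a minimal sketch of how scores might be reported separately per task and per domain. It is an illustration only: the `EvalResult` type, the `summarize` helper, the task labels, the domain names, and the score scale are all assumptions made for this example and are not taken from the paper, whose actual metrics and data format are described in the full text.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# Hypothetical task labels mirroring the two components named in the summary.
TASKS = ("target_planning", "dialogue_guidance")


@dataclass(frozen=True)
class EvalResult:
    """One scored run; every field name here is illustrative, not from the paper."""
    domain: str   # placeholder domain label
    task: str     # one of TASKS
    score: float  # assumed higher-is-better score on an arbitrary scale


def summarize(results):
    """Return {task: {domain: mean score}} so each component is reported separately."""
    buckets = defaultdict(lambda: defaultdict(list))
    for r in results:
        if r.task not in TASKS:
            raise ValueError(f"unknown task: {r.task}")
        buckets[r.task][r.domain].append(r.score)
    return {
        task: {domain: mean(scores) for domain, scores in by_domain.items()}
        for task, by_domain in buckets.items()
    }


# Made-up scores, used only to exercise the helper.
results = [
    EvalResult("domain_a", "target_planning", 4.0),
    EvalResult("domain_a", "target_planning", 2.0),
    EvalResult("domain_b", "dialogue_guidance", 3.0),
]
print(summarize(results))
```

Keeping the two tasks in separate buckets reflects the point of the decomposition: a model can be strong at one component and weaker at the other, which a single pooled score would hide.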

## Key findings

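- Across the 22 LLMs evaluated, DeepSeek-R1 shows exceptional performance on target planning.
- Claude-3.7-Sonnet shows exceptional performance on dialogue guidance.
- The paper also investigates how reasoning capabilities influence proactive behaviors and discusses the implications for future model development.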

## Abstract

Proactive dialogue has emerged as a critical and challenging research problem in advancing large language models (LLMs). Existing works predominantly focus on domain-specific or task-oriented scenarios, which leads to fragmented evaluations and limits the comprehensive exploration of models' proactive conversation abilities. In this work, we propose ProactiveEval, a unified framework designed for evaluating proactive dialogue capabilities of LLMs. This framework decomposes proactive dialogue into target planning and dialogue guidance, establishing evaluation metrics across various domains. Moreover, it also enables the automatic generation of diverse and challenging evaluation data. Based on the proposed framework, we develop 328 evaluation environments spanning 6 distinct domains. Through experiments with 22 different types of LLMs, we show that DeepSeek-R1 and Claude-3.7-Sonnet exhibit exceptional performance on target planning and dialogue guidance tasks, respectively. Finally, we investigate how reasoning capabilities influence proactive behaviors and discuss their implications for future model development.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20973/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20973/full.md

## References

53 references — full list in the complete paper: https://tomesphere.com/paper/2508.20973/full.md

---
Source: https://tomesphere.com/paper/2508.20973