Unlocking Proactivity in Task-Oriented Dialogue

Hongbin Zhang; Ning Gao; Yuqin Dai; Ruiyuan Wu; Jinpeng Wang; Rena Wei Gao; Bingdong Tan; Shuzheng Gao; Zongjie Li; Chaozheng Wang

arXiv:2605.22240 · cs.AI · May 22, 2026

TL;DR

This paper makes task-oriented dialogue agents more proactive by modeling users' latent concerns and training policies against a concern-aware user simulator, which leads to more persuasive agents.

Contribution

It proposes the Cognitive User Simulator and concern-conditioned policy optimization methods to improve proactivity in dialogue agents beyond traditional reward-shaping techniques.

Findings

01. Concern conditioning significantly increases agent proactivity.
02. The simulator produces diverse, realistic user interactions.
03. Concern-aware training improves persuasion success rates.

Abstract

Proactive task-oriented dialogue (TOD), such as outbound sales, demands a persuasive agent that actively probes the user's concerns and steers the conversation toward acceptance within a bounded number of turns. Yet post-trained LLMs are inherently conservative, and reward-shaping RL (e.g., GRPO) struggles since it only re-weights what an already passive policy samples. We show that conditioning on the user's latent concerns unlocks proactive capability that no amount of sampling can recover, establishing these concerns as a pivotal training-time signal. To operationalize this finding, we build the Cognitive User Simulator, which models each user as a stratified persona comprising observable external traits and hidden internal concerns. The simulator produces faithful and diverse interactions, while emitting per-turn state dynamics that track persuasion progress. We then…
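
The abstract's remark that reward-shaping RL only re-weights what a passive policy already samples can be illustrated with the group-relative advantage used in GRPO-style training. The sketch below is not the paper's code. The function name, the binary rewards, the use of the population standard deviation, and the `eps` constant are all assumptions made for illustration. It shows that when every rollout in a group earns the same reward (for example, every passive attempt fails), all advantages are zero and the update has nothing to re-weight.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group.

    Hypothetical helper: mirrors the (r - mean) / std normalization used in
    GRPO-style methods; eps (an assumed constant) avoids division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std is an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed toy rewards: 1.0 if the user accepted, 0.0 otherwise.
passive_group = [0.0, 0.0, 0.0, 0.0]  # every sampled dialogue fails
mixed_group = [0.0, 0.0, 0.0, 1.0]    # one sampled dialogue succeeds

passive_adv = group_relative_advantages(passive_group)
mixed_adv = group_relative_advantages(mixed_group)

print(passive_adv)  # identical rewards give no learning signal
print(mixed_adv)    # the single success is pushed up, the failures down
```

In this toy setting, a learning signal appears only once at least one sampled dialogue differs in reward from the rest of its group, which is consistent with the abstract's argument for adding a training-time signal beyond reward shaping.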
