IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning
Haohao Luo, Zexi Li, Yuexiang Xie, Wenhao Zhang, Yaliang Li, Ying Shen

TL;DR
IntentRL trains proactive research agents to clarify user intents early, improving efficiency and performance in deep research tasks by using reinforcement learning and a scalable intent refinement pipeline.
Contribution
The paper introduces IntentRL, a novel framework that enhances deep research agents with proactive intent clarification using a two-stage RL training strategy.
Findings
IntentRL outperforms baseline clarify modules and proactive LLMs.
It significantly improves intent hit rate and downstream research performance.
The scalable intent refinement pipeline effectively expands limited research data.
Abstract
Deep Research (DR) agents extend Large Language Models (LLMs) beyond parametric knowledge by autonomously retrieving and synthesizing evidence from large web corpora into long-form reports, enabling a long-horizon agentic paradigm. However, unlike real-time conversational assistants, DR is computationally expensive and time-consuming, creating an autonomy-interaction dilemma: high autonomy on ambiguous user queries often leads to prolonged execution with unsatisfactory outcomes. To address this, we propose IntentRL, a framework that trains proactive agents to clarify latent user intents before starting long-horizon research. To overcome the scarcity of open-ended research data, we introduce a scalable pipeline that expands a few seed samples into high-quality dialogue turns via a shallow-to-deep intent refinement graph. We further adopt a two-stage reinforcement learning (RL) strategy:…
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Mobile Crowdsensing and Crowdsourcing
