Training Proactive and Personalized LLM Agents
Weiwei Sun, Xuhui Zhou, Weihua Du, Xingyao Wang, Sean Welleck, Graham Neubig, Maarten Sap, and Yiming Yang

TL;DR
This paper presents a multi-objective reinforcement learning approach called PPP that trains LLM agents to optimize productivity, proactivity, and personalization, resulting in more effective and user-adaptive AI agents in interactive tasks.
Contribution
The paper introduces UserVille, a configurable environment built on LLM-based user simulators, and proposes PPP, a multi-objective RL method that jointly optimizes productivity, proactivity, and personalization for LLM agents.
Findings
Agents trained with PPP outperform strong baselines such as GPT-5 by +21.6 on average on software engineering and deep research tasks.
PPP enables agents to ask strategic questions and adapt to unseen user preferences.
Explicitly optimizing user-centered interaction improves AI agent effectiveness.
Abstract
While existing work focuses primarily on task success, we argue that effective real-world agents require optimizing three dimensions: productivity (task completion), proactivity (asking essential questions), and personalization (adapting to diverse user preferences). We introduce UserVille, an interactive environment with LLM-based user simulators enabling diverse, configurable user preferences. Leveraging UserVille, we introduce PPP, a multi-objective reinforcement learning approach that jointly optimizes all three dimensions: Productivity, Proactivity, and Personalization. Experiments on software engineering and deep research tasks show that agents trained with PPP achieve substantial improvements over strong baselines such as GPT-5 (+21.6 on average), demonstrating the ability to ask strategic clarifying questions, adapt to unseen user preferences, and improve task success through…
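The idea of jointly optimizing three dimensions can be illustrated with a scalar reward that combines one term per dimension. The sketch below is only an illustration under stated assumptions: the abstract does not give the reward formulation, so the weighted-sum form, the `EpisodeSignals` fields, the per-question coefficients, and the weights are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class EpisodeSignals:
    """Hypothetical per-episode measurements; field names are illustrative."""
    task_success: float         # productivity signal in [0, 1]
    essential_questions: int    # clarifying questions the task actually needed
    unnecessary_questions: int  # questions that cost user effort without helping
    preference_violations: int  # times the agent ignored a stated user preference


def combined_reward(s, w_prod=1.0, w_proact=0.5, w_pers=0.5):
    """Weighted sum of three reward terms (an assumed form, not the paper's)."""
    # Productivity: did the agent complete the task?
    productivity = s.task_success
    # Proactivity: credit essential questions, penalize unnecessary ones.
    proactivity = 0.1 * s.essential_questions - 0.2 * s.unnecessary_questions
    # Personalization: penalize each violation of the user's preferences.
    personalization = -0.5 * s.preference_violations
    return w_prod * productivity + w_proact * proactivity + w_pers * personalization


# Two episodes that both solve the task but differ in interaction quality.
good = EpisodeSignals(task_success=1.0, essential_questions=2,
                      unnecessary_questions=0, preference_violations=0)
poor = EpisodeSignals(task_success=1.0, essential_questions=0,
                      unnecessary_questions=3, preference_violations=1)
print(combined_reward(good), combined_reward(poor))
```

Under these assumed coefficients, two episodes with identical task success receive different rewards when one asks needless questions or ignores a user preference, which is the kind of signal a task-success-only objective cannot provide.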
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Spreadsheets and End-User Computing · Software Engineering Research
