Training Proactive and Personalized LLM Agents

Weiwei Sun, Xuhui Zhou, Weihua Du, Xingyao Wang, Sean Welleck, Graham Neubig, Maarten Sap, and Yiming Yang

arXiv:2511.02208·cs.AI·November 5, 2025

TL;DR

This paper presents a multi-objective reinforcement learning approach called PPP that trains LLM agents to optimize productivity, proactivity, and personalization, resulting in more effective and user-adaptive AI agents in interactive tasks.

Contribution

The paper introduces UserVille, a configurable user-simulation environment, and proposes PPP, a multi-objective RL method that jointly optimizes productivity, proactivity, and personalization for LLM agents.

Findings

01 Agents trained with PPP outperform strong baselines such as GPT-5 by +21.6 on average.

02 PPP enables agents to ask strategic clarifying questions and adapt to unseen user preferences.

03 Explicitly optimizing for user-centered interaction improves agent effectiveness, including task success.

Abstract

While existing work focuses primarily on task success, we argue that effective real-world agents require optimizing three dimensions: productivity (task completion), proactivity (asking essential questions), and personalization (adapting to diverse user preferences). We introduce UserVille, an interactive environment with LLM-based user simulators enabling diverse, configurable user preferences. Leveraging UserVille, we introduce PPP, a multi-objective reinforcement learning approach that jointly optimizes all three dimensions: Productivity, Proactivity, and Personalization. Experiments on software engineering and deep research tasks show that agents trained with PPP achieve substantial improvements over strong baselines such as GPT-5 (+21.6 on average), demonstrating the ability to ask strategic clarifying questions, adapt to unseen user preferences, and improve task success through…
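
As a rough illustration of what jointly optimizing the three dimensions could look like, the sketch below collapses per-episode scores into a single scalar reward with a weighted sum. The abstract does not say how PPP combines its objectives, so the weighted-sum form, the `EpisodeScores` type, the `combined_reward` function, and the default weights are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class EpisodeScores:
    """Hypothetical per-episode scores, one per dimension named in the abstract."""
    productivity: float      # e.g. 1.0 if the task was completed, else 0.0
    proactivity: float       # e.g. credit for asking essential questions
    personalization: float   # e.g. credit for following the user's preferences


def combined_reward(scores: EpisodeScores,
                    weights: tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted-sum scalarization (an assumption, not the paper's formula)."""
    w_prod, w_proact, w_pers = weights
    return (w_prod * scores.productivity
            + w_proact * scores.proactivity
            + w_pers * scores.personalization)


if __name__ == "__main__":
    episode = EpisodeScores(productivity=1.0, proactivity=0.5, personalization=0.0)
    print(combined_reward(episode))
    print(combined_reward(episode, weights=(1.0, 2.0, 0.0)))
```

A scalar of this kind could be passed to a standard policy-gradient trainer; the weights control how strongly interaction quality is traded off against task completion.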

Code & Models

Models

sunweiwei/PPP-36B (model · 4 downloads · 1 like)

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Spreadsheets and End-User Computing · Software Engineering Research