PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants
Zheng Zhao, Clara Vania, Subhradeep Kayal, Naila Khan, Shay B. Cohen, Emine Yilmaz

TL;DR
PersonaLens is a new benchmark designed to evaluate how well conversational AI assistants personalize interactions in task-oriented settings, addressing limitations of existing benchmarks that focus on non-task domains.
Contribution
We introduce PersonaLens, a comprehensive benchmark with diverse user profiles and specialized agents to systematically assess personalization in task-oriented AI assistants.
Findings
Current LLM assistants show high variability in personalization capabilities.
PersonaLens reveals gaps in personalization across different tasks.
The benchmark facilitates targeted improvements in conversational AI personalization.
Abstract
Large language models (LLMs) have advanced conversational AI assistants. However, systematically evaluating how well these assistants apply personalization--adapting to individual user preferences while completing tasks--remains challenging. Existing personalization benchmarks focus on chit-chat, non-conversational tasks, or narrow domains, failing to capture the complexities of personalized task-oriented assistance. To address this, we introduce PersonaLens, a comprehensive benchmark for evaluating personalization in task-oriented AI assistants. Our benchmark features diverse user profiles equipped with rich preferences and interaction histories, along with two specialized LLM-based agents: a user agent that engages in realistic task-oriented dialogues with AI assistants, and a judge agent that employs the LLM-as-a-Judge paradigm to assess personalization, response quality, and task…
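The evaluation setup described in the abstract (a user profile, a user agent that talks to the assistant, and a judge agent that scores the resulting dialogue) might be sketched roughly as below. This is a minimal illustration, not the authors' implementation: every name here (`UserProfile`, `run_dialogue`, `evaluate`, and the `toy_*` functions) is hypothetical, and the agents are rule-based stand-ins rather than LLM calls. The toy judge reports only a personalization score; the other criteria listed in the abstract are not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# NOTE: all names in this sketch are hypothetical and are not taken from
# the PersonaLens codebase. The agents are simple stand-ins, not LLMs.

Turn = Tuple[str, str]  # (speaker, utterance)


@dataclass
class UserProfile:
    """A user with stated preferences and past interactions."""
    user_id: str
    preferences: Dict[str, str]
    history: List[str] = field(default_factory=list)


def run_dialogue(
    profile: UserProfile,
    task: str,
    user_agent: Callable[[UserProfile, str, List[Turn]], str],
    assistant: Callable[[List[Turn]], str],
    max_turns: int = 2,
) -> List[Turn]:
    """Alternate user-agent and assistant turns for a fixed number of rounds."""
    dialogue: List[Turn] = []
    for _ in range(max_turns):
        dialogue.append(("user", user_agent(profile, task, dialogue)))
        dialogue.append(("assistant", assistant(dialogue)))
    return dialogue


def evaluate(profile, task, user_agent, assistant, judge) -> Dict[str, float]:
    """Simulate a dialogue, then hand it to the judge for scoring."""
    dialogue = run_dialogue(profile, task, user_agent, assistant)
    return judge(profile, task, dialogue)


def toy_user_agent(profile: UserProfile, task: str, dialogue: List[Turn]) -> str:
    # Stand-in for an LLM-based user agent: simply restates the task.
    return f"Please help me with: {task}"


def toy_assistant(dialogue: List[Turn]) -> str:
    # Stand-in for the assistant under test: mentions one preference only.
    return "Here is a vegan option."


def toy_judge(profile: UserProfile, task: str, dialogue: List[Turn]) -> Dict[str, float]:
    # Stand-in for an LLM judge: fraction of preference values that the
    # assistant mentioned anywhere in its turns.
    if not profile.preferences:
        return {"personalization": 0.0}
    text = " ".join(u.lower() for speaker, u in dialogue if speaker == "assistant")
    hits = sum(1 for value in profile.preferences.values() if value.lower() in text)
    return {"personalization": hits / len(profile.preferences)}


if __name__ == "__main__":
    profile = UserProfile("u1", {"cuisine": "vegan", "budget": "cheap"})
    print(evaluate(profile, "book a restaurant", toy_user_agent, toy_assistant, toy_judge))
```

In a real setup, each `toy_*` function would presumably be replaced by a call to a language model, with the judge prompted to rate the dialogue rather than match keywords.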
Taxonomy
Topics: AI in Service Interactions · Persona Design and Applications · Artificial Intelligence in Healthcare and Education
