PersoBench: Benchmarking Personalized Response Generation in Large Language Models
Saleh Afzoon, Zahra Jamali, Usman Naseem, Amin Beheshti

TL;DR
This paper introduces PersoBench, an automated benchmarking pipeline to evaluate the personalization ability of large language models in persona-aware dialogue generation, revealing current models' strengths and weaknesses.
Contribution
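The paper presents PersoBench, an automated benchmarking framework for assessing LLM personalization in dialogue. It targets a gap left by existing benchmarks, which evaluate persona consistency in role-playing contexts but leave personalization in response generation underexplored.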
Findings
The evaluated LLMs generate fluent and diverse responses.
They struggle with personalization and coherence.
These limitations show up in the evaluation across multiple models and datasets.
Abstract
While large language models (LLMs) have exhibited impressive conversational capabilities, their proficiency in delivering personalized responses remains unclear. Although recent benchmarks automatically evaluate persona consistency in role-playing contexts using LLM-based judgment, the evaluation of personalization in response generation remains underexplored. To address this gap, we present an automated benchmarking pipeline, PersoBench, to evaluate the personalization ability of LLMs in persona-aware dialogue generation within a zero-shot setting. Our framework employs a structured pipeline comprising speaker-aware annotation, task-specific and context-driven prompt construction, response post-processing, and automated evaluation across multiple dimensions of generation quality. In particular, the pipeline performs text preprocessing and speaker labeling, constructs structured prompts…
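The pipeline stages named in the abstract (speaker labeling, prompt construction, response post-processing, automated scoring) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: every function name, the prompt wording, the JSON output format, and the two metrics (`distinct_n` as a stand-in for diversity, `persona_overlap` as a crude stand-in for personalization) are assumptions made for this example.

```python
import json
import re


def label_speakers(turns, speakers=("User", "Assistant")):
    """Prefix each utterance with an alternating speaker label (assumed scheme)."""
    return [f"{speakers[i % 2]}: {t.strip()}" for i, t in enumerate(turns)]


def build_prompt(persona, turns):
    """Assemble a zero-shot prompt from persona facts and labeled dialogue context."""
    persona_block = "\n".join(f"- {p}" for p in persona)
    context_block = "\n".join(label_speakers(turns))
    return (
        "Persona:\n" + persona_block + "\n\n"
        "Conversation:\n" + context_block + "\n\n"
        'Reply as the next speaker. Return JSON: {"response": "..."}'
    )


def postprocess(raw):
    """Pull the response text out of raw model output, tolerating extra text."""
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match:
        try:
            return str(json.loads(match.group(0)).get("response", "")).strip()
        except json.JSONDecodeError:
            pass  # fall through and treat the whole output as the response
    return raw.strip()


def distinct_n(text, n=2):
    """Share of unique n-grams among all n-grams (hypothetical diversity proxy)."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def persona_overlap(response, persona):
    """Fraction of persona words reused in the response (hypothetical proxy)."""
    resp = set(re.findall(r"[a-z']+", response.lower()))
    pers = set(re.findall(r"[a-z']+", " ".join(persona).lower()))
    return len(resp & pers) / len(pers) if pers else 0.0
```

In use, the string returned by `build_prompt` would be sent to the model under test, the raw reply passed through `postprocess`, and the cleaned response scored. Word overlap is only a rough signal; a real benchmark would likely use stronger measures of personalization and coherence.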
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
