Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions
Pedro Henrique Luz de Araujo, Michael A. Hedderich, Ali Modarressi, Hinrich Schuetze, Benjamin Roth

TL;DR
This paper presents a new evaluation protocol for long, multi-round dialogues with persona-based LLMs, revealing that persona fidelity diminishes over time and highlighting challenges in maintaining safety and instruction adherence.
Contribution
It introduces a robust long-dialogue evaluation framework and systematically studies how dialogue length impacts persona fidelity, instruction following, and safety in state-of-the-art LLMs.
Findings
Persona fidelity degrades over extended dialogues.
A trade-off exists between persona fidelity and instruction following.
Baseline models sometimes outperform persona-assigned models in early dialogue stages.
Abstract
Persona-assigned large language models (LLMs) are used in domains such as education, healthcare, and sociodemographic simulation. Yet, they are typically evaluated only in short, single-round settings that do not reflect real-world usage. We introduce an evaluation protocol that combines long persona dialogues (over 100 rounds) and evaluation datasets to create dialogue-conditioned benchmarks that can robustly measure long-context effects. We then investigate the effects of dialogue length on persona fidelity, instruction-following, and safety of seven state-of-the-art open- and closed-weight LLMs. We find that persona fidelity degrades over the course of dialogues, especially in goal-oriented conversations, where models must sustain both persona fidelity and instruction following. We identify a trade-off between fidelity and instruction following, with non-persona baselines initially…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsPersona Design and Applications · Social Robot Interaction and HRI · AI in Service Interactions
