Quantifying and Optimizing Global Faithfulness in Persona-driven Role-playing
Letian Peng, Jingbo Shang

TL;DR
This paper introduces the Active-Passive-Constraint (APC) score, an explainable metric for measuring and optimizing how faithfully persona-driven role-playing AI characters adhere to their given personas.
Contribution
The paper proposes the APC score, a fine-grained, explainable criterion for assessing and improving faithfulness in persona-driven role-playing (PRP). The score is validated against human judgment and used as a reference signal for optimization.
Findings
APC score correlates highly with human evaluations.
Direct preference optimization (DPO) guided by the APC score outperforms existing methods in faithfulness.
The approach scales effectively to large persona datasets.
Abstract
Persona-driven role-playing (PRP) aims to build AI characters that can respond to user queries while faithfully adhering to all persona statements. Unfortunately, existing faithfulness criteria for PRP are limited to coarse-grained LLM-based scoring without a clear definition or formulation. This paper presents a pioneering exploration to quantify PRP faithfulness as a fine-grained and explainable criterion, which also serves as a reliable reference for optimization. Our criterion first divides persona statements into active and passive constraints by identifying the query-statement relevance. Then, we incorporate all constraints following the principle that the AI character's response should be (a) entailed by active (relevant) constraints and (b) not contradicted by passive (irrelevant) constraints. We translate this principle mathematically into a novel Active-Passive-Constraint…
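The active/passive principle in the abstract can be illustrated with a small sketch. The abstract is cut off before it states the formula, so the aggregation below is an assumption rather than the paper's definition: each persona statement contributes the probability that it is relevant times the probability that it entails the response, plus the probability that it is irrelevant times the probability that it does not contradict the response. The function name `apc_score_sketch` and its three input lists are hypothetical; in practice the probabilities would come from relevance and entailment models, which are not modeled here.

```python
from typing import Sequence


def apc_score_sketch(
    relevance: Sequence[float],
    entailment: Sequence[float],
    contradiction: Sequence[float],
) -> float:
    """Illustrative active-passive constraint score (assumed form, not the paper's).

    relevance[i]     -- probability that statement i is relevant to the query
    entailment[i]    -- probability that statement i entails the response
    contradiction[i] -- probability that statement i contradicts the response
    """
    if not (len(relevance) == len(entailment) == len(contradiction)):
        raise ValueError("all inputs must have one probability per statement")

    total = 0.0
    for p_rel, p_ent, p_con in zip(relevance, entailment, contradiction):
        for p in (p_rel, p_ent, p_con):
            if not 0.0 <= p <= 1.0:
                raise ValueError("probabilities must lie in [0, 1]")
        # Active constraint: a relevant statement should entail the response.
        active_term = p_rel * p_ent
        # Passive constraint: an irrelevant statement should not be contradicted.
        passive_term = (1.0 - p_rel) * (1.0 - p_con)
        total += active_term + passive_term
    return total


# Three hypothetical statements: one clearly relevant, one clearly irrelevant,
# and one whose relevance is uncertain.
score = apc_score_sketch(
    relevance=[1.0, 0.0, 0.5],
    entailment=[0.8, 0.1, 0.4],
    contradiction=[0.0, 0.2, 0.2],
)
```

Under this assumed form, each statement contributes a value between 0 and 1, so the per-statement terms show which persona statements a response satisfies or violates.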
Taxonomy
Topics: Persona Design and Applications · AI in Service Interactions · Human-Automation Interaction and Safety
Methods: Attention Is All You Need · Direct Preference Optimization · Linear Layer · Multi-Head Attention · Dense Connections · Position-Wise Feed-Forward Layer · Dropout · Label Smoothing · Residual Connection · Absolute Position Encodings
