RoleRMBench & RoleRM: Towards Reward Modeling for Profile-Based Role Play in Dialogue Systems

Hang Ding; Qiming Feng; Dongqi Liu; Qi Zhao; Tao Yao; Shuo Wang; Dongsheng Chen; Jian Li; Zhenye Gan; Jiangning Zhang; Chengjie Wang; Yabiao Wang

arXiv:2512.10575·cs.CL·December 12, 2025

Open Access

TL;DR

This paper introduces RoleRMBench, a benchmark for reward modeling in role-playing dialogue, and proposes RoleRM, a new reward model trained with continuous implicit preferences, significantly improving alignment with human judgments.

Contribution

The paper presents the first benchmark for reward modeling in role play and introduces RoleRM, a novel reward model utilizing continuous implicit preferences for better subjective alignment.

Findings

01. RoleRM outperforms existing reward models by over 24% on average.

02. Large gaps exist between general reward models and human judgments in role play.

03. Continuous implicit preferences improve subjective evaluation consistency.

Abstract

Reward modeling has become a cornerstone of aligning large language models (LLMs) with human preferences. Yet, when extended to subjective and open-ended domains such as role play, existing reward models exhibit severe degradation, struggling to capture nuanced and persona-grounded human judgments. To address this gap, we introduce RoleRMBench, the first systematic benchmark for reward modeling in role-playing dialogue, covering seven fine-grained capabilities from narrative management to role consistency and engagement. Evaluation on RoleRMBench reveals large and consistent gaps between general-purpose reward models and human judgment, particularly in narrative and stylistic dimensions. We further propose RoleRM, a reward model trained with Continuous Implicit Preferences (CIP), which reformulates subjective evaluation as continuous consistent pairwise supervision under multiple…
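For readers unfamiliar with pairwise reward modeling, the following is a minimal sketch of two generic building blocks: a Bradley-Terry style pairwise loss and a pairwise accuracy score against human preference labels. It is not the paper's CIP objective or the RoleRMBench protocol, neither of which is specified in the text above; the function names `bradley_terry_loss` and `pairwise_accuracy` and the example scores are hypothetical and chosen only for illustration.

```python
import math


def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred response wins.

    Assumption: the standard Bradley-Terry formulation, -log(sigmoid(margin)),
    used here as a generic stand-in, not as the paper's training objective.
    """
    margin = r_chosen - r_rejected
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pairwise_accuracy(pairs):
    """Fraction of (chosen_score, rejected_score) pairs ranked correctly.

    A pair counts as correct only when the chosen response scores strictly
    higher than the rejected one; ties count as errors.
    """
    if not pairs:
        raise ValueError("need at least one scored pair")
    return sum(1 for chosen, rejected in pairs if chosen > rejected) / len(pairs)


# Hypothetical reward-model scores for four human-labelled response pairs.
scores = [(2.1, 0.3), (0.5, 1.2), (1.0, -0.4), (0.0, 0.0)]
print(pairwise_accuracy(scores))  # 0.5
```

Under these assumptions, a reward model that agrees with human annotators on every pair would reach a pairwise accuracy of 1.0, and the loss shrinks as the score margin in favor of the preferred response grows.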

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech and dialogue systems