Act-Adaptive Margin: Dynamically Calibrating Reward Models for Subjective Ambiguity

Feiteng Fang; Dingwei Chen; Xiang Huang; Ting-En Lin; Yuchuan Wu; Xiong Liu; Xinge Ye; Ziqiang Liu; Haonan Zhang; Liang Zhu; Hamid Alinejad-Rokny; Min Yang; Yongbin Li

arXiv:2505.23923 · cs.CL · January 9, 2026

Open Access

TL;DR

This paper introduces Act-Adaptive Margin (AAM), a method that dynamically calibrates reward models to better handle subjective ambiguity, significantly improving performance in subjective reward modeling and alignment tasks.

Contribution

AAM provides a novel way to calibrate preference margins dynamically, enhancing reward models' ability to handle subjective ambiguity without extra human annotation.

Findings

1. AAM improves Bradley-Terry reward models by 2.95% in general tasks.
2. AAM enhances performance by 4.85% in subjective role-playing tasks.
3. Reward models with AAM achieve state-of-the-art results on CharacterEval and Charm.

Abstract

Currently, most reinforcement learning tasks focus on domains like mathematics and programming, where verification is relatively straightforward. However, in subjective tasks such as role-playing, alignment techniques struggle to make progress, primarily because subjective reward modeling using the Bradley-Terry model faces significant challenges when dealing with ambiguous preferences. To improve reward modeling in subjective tasks, this paper proposes AAM (Act-Adaptive Margin), which enhances reward modeling by dynamically calibrating preference margins using the model's internal parameter knowledge. We design two versions of AAM that efficiently generate contextually appropriate preference gaps without additional human annotation. This approach fundamentally improves how reward models handle subjective rewards by better…
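As a rough illustration of what a preference margin does in a Bradley-Terry objective, the sketch below implements the generic margin-augmented pairwise loss, -log σ(r_chosen - r_rejected - m). It is not the paper's method: the abstract does not say how AAM derives the margin, so here `margin` is simply a number supplied by the caller, and the names `log_sigmoid` and `bt_margin_loss` are hypothetical.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bt_margin_loss(r_chosen: float, r_rejected: float, margin: float = 0.0) -> float:
    """Pairwise Bradley-Terry loss with an additive margin.

    margin = 0 recovers the standard Bradley-Terry loss.
    How the margin is chosen is an assumption left to the caller;
    the source does not specify AAM's rule.
    """
    return -log_sigmoid(r_chosen - r_rejected - margin)


if __name__ == "__main__":
    # Same reward gap, different margins: a larger margin demands a
    # larger gap before the loss becomes small.
    for m in (0.0, 0.5, 2.0):
        print(f"margin={m}: loss={bt_margin_loss(2.0, 1.0, margin=m):.4f}")
```

In this form, a small margin suits ambiguous pairs where the two responses are nearly equally acceptable, and a large margin suits clear-cut pairs. Calibrating that value per pair is the part the abstract attributes to AAM.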
