ReflectRM: Boosting Generative Reward Models via Self-Reflection within a Unified Judgment Framework

Kai Qin; Liangxin Liu; Yu Liang; Longzheng Wang; Yan Wang; Yueyang Zhang; Long Xia; Zhiyuan Sun; Houde Liu; Daiting Shi

arXiv:2604.07506·cs.AI·April 21, 2026

TL;DR

ReflectRM introduces a self-reflective generative reward model that jointly assesses response and analysis preferences, significantly improving alignment accuracy and reducing positional bias in large language model evaluation.

Contribution

The paper presents a unified generative reward modeling framework that uses self-reflection to improve preference assessment and model stability.

Findings

01 Achieves a +3.7 accuracy gain on the Qwen3-4B model.

02 Substantially reduces positional bias, with a 10.2-point improvement.

03 Response and analysis preferences mutually reinforce each other.

Abstract

Reward Models (RMs) are critical components in the Reinforcement Learning from Human Feedback (RLHF) pipeline, directly determining the alignment quality of Large Language Models (LLMs). Recently, Generative Reward Models (GRMs) have emerged as a superior paradigm, offering higher interpretability and stronger generalization than traditional scalar RMs. However, existing methods for GRMs focus primarily on outcome-level supervision, neglecting analytical process quality, which constrains their potential. To address this, we propose ReflectRM, a novel GRM that leverages self-reflection to assess analytical quality and enhance preference modeling. ReflectRM is trained under a unified generative framework for joint modeling of response preference and analysis preference. During inference, we use its self-reflection capability to identify the most reliable analysis, from which the final…
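The inference step described above, choosing the most reliable analysis via self-reflection, might look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' implementation. The `Analysis` type, the `select_by_reflection` function, the `prefer` callback, and the round-robin win count are all hypothetical. The sketch also assumes that each sampled analysis carries its own verdict and that the selected analysis supplies the final answer, which the truncated abstract does not confirm.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List


@dataclass
class Analysis:
    text: str     # the model's written critique of the two responses
    verdict: str  # "A" or "B": which response this critique prefers


def select_by_reflection(
    candidates: List[Analysis],
    prefer: Callable[[Analysis, Analysis], int],
) -> Analysis:
    """Return the candidate that wins the most pairwise comparisons.

    `prefer(x, y)` is a hypothetical stand-in for the reward model judging
    its own analyses: it returns 0 if x is better and 1 if y is better.
    """
    if not candidates:
        raise ValueError("need at least one candidate analysis")
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        winner = i if prefer(candidates[i], candidates[j]) == 0 else j
        wins[winner] += 1
    # max() returns the first maximal index, so ties go to the earliest sample.
    best = max(range(len(candidates)), key=wins.__getitem__)
    return candidates[best]


# Toy judge for demonstration only: it simply prefers the longer analysis.
def toy_prefer(x: Analysis, y: Analysis) -> int:
    return 0 if len(x.text) >= len(y.text) else 1


samples = [
    Analysis("B is shorter.", "B"),
    Analysis("A cites the correct formula and checks units.", "A"),
    Analysis("A seems fine.", "A"),
]
chosen = select_by_reflection(samples, toy_prefer)
print(chosen.verdict)
```

In a real setting `prefer` would be a call to the same generative reward model, prompted to compare two of its own analyses. The toy length-based judge is only there to make the sketch runnable.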

Code & Models

Repositories

yuliangCarmelo/ReflectRM (GitHub)
