RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Hongli Zhou; Hui Huang; Wei Liu; Chenglong Wang; Xingyuan Bu; Lvyuan Han; Fuhai Song; Muyun Yang; Wenhao Jiang; Hailong Cao; Tiejun Zhao

arXiv:2601.14032 · cs.CL · January 21, 2026

TL;DR

This paper introduces RM-Distiller, a novel framework that leverages the multifaceted capabilities of generative LLMs to improve reward model distillation, leading to better alignment with human preferences.

Contribution

RM-Distiller systematically exploits the refinement, scoring, and generation capabilities of teacher LLMs for reward model distillation, rather than treating the teacher as a simple binary annotator.

Findings

01. Outperforms traditional distillation methods on RM benchmarks

02. Improves reinforcement learning-based alignment results

03. Demonstrates the importance of multifaceted teacher capabilities

Abstract

Reward models (RMs) play a pivotal role in aligning large language models (LLMs) with human preferences. Because high-quality human preference annotations are difficult to obtain, distilling preferences from generative LLMs has emerged as a standard practice. However, existing approaches predominantly treat teacher models as simple binary annotators, failing to fully exploit their rich knowledge and capabilities for RM distillation. To address this, we propose RM-Distiller, a framework designed to systematically exploit the multifaceted capabilities of teacher LLMs: (1) Refinement capability, which synthesizes highly correlated response pairs to create fine-grained and contrastive signals. (2) Scoring capability, which guides the RM in capturing precise preference strength via a margin-aware optimization objective. (3) Generation capability, which incorporates the teacher's generative…
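
As an illustration of what a margin-aware objective can look like, here is a minimal Python sketch. It assumes the widely used Bradley-Terry pairwise loss with an additive margin; the abstract does not give the paper's exact formulation, so the function names, the `scale` parameter, and the choice to derive the margin from the gap between teacher scores are all assumptions made for illustration.

```python
import math


def margin_from_teacher_scores(score_chosen, score_rejected, scale=1.0):
    """Hypothetical helper: turn the teacher's score gap into a margin.

    `scale` is an assumed hyperparameter, not something taken from the paper.
    """
    return scale * (score_chosen - score_rejected)


def margin_aware_bt_loss(r_chosen, r_rejected, margin=0.0):
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected - margin)).

    With margin = 0 this reduces to the plain Bradley-Terry loss.
    A positive margin asks the reward model to separate the pair
    by at least that much before the loss becomes small.
    """
    z = r_chosen - r_rejected - margin
    # Numerically stable form of log(1 + exp(-z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # Assumed teacher scores on an arbitrary scale (illustrative values only).
    margin = margin_from_teacher_scores(score_chosen=9.0, score_rejected=4.0, scale=0.2)
    plain = margin_aware_bt_loss(r_chosen=1.5, r_rejected=1.0)
    with_margin = margin_aware_bt_loss(r_chosen=1.5, r_rejected=1.0, margin=margin)
    print(f"margin={margin:.2f} plain={plain:.4f} with_margin={with_margin:.4f}")
```

For the same reward gap, a larger margin produces a larger loss, so pairs the teacher scores far apart push the reward model toward a wider separation than pairs it scores nearly equal.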

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Recommender Systems and Techniques