Scaling Reward Modeling without Human Supervision

Jingxuan Fan; Yueying Li; Zhenting Qi; Dinghuai Zhang; Kianté Brantley; Sham M. Kakade; Hanlin Zhang

arXiv:2603.02225 · cs.LG · March 17, 2026

Open Access

TL;DR

This paper shows that reward models can be scaled without human annotations through unsupervised preference learning over document prefixes and suffixes drawn from large web corpora, improving RewardBench accuracy and downstream task performance.

Contribution

The paper introduces reward-based scaling (RBS), an unsupervised reward modeling approach, and shows that it is effective across model scales and tasks without human annotations.

Findings

01

Improved RewardBench v2 accuracy by up to +7.7 points on average across models, with gains of up to +16.1 points on in-domain math subsets.

02

Demonstrated transferability across different model families and scales.

03

Matched or exceeded supervised reward model performance in downstream tasks.

Abstract

Learning from feedback is an instrumental process for advancing the capabilities and safety of frontier models, yet its effectiveness is often constrained by cost and scalability. We present a pilot study that explores scaling reward models through unsupervised approaches. We operationalize reward-based scaling (RBS), in its simplest form, as preference learning over document prefixes and suffixes drawn from large-scale web corpora. Its advantage is demonstrated in various aspects: despite using no human annotations, training on 11M tokens of math-focused web data yields steady gains on RewardBench v1 and v2, and these improvements consistently transfer across diverse initialization backbones spanning model families and scales. Across models, our method improves RewardBench v2 accuracy by up to +7.7 points on average, with gains of up to +16.1 on in-domain math subsets and consistent…
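The abstract describes RBS as preference learning over document prefixes and suffixes but, as quoted here, does not say how the preference pairs are built or which loss is used. The following is a minimal sketch under stated assumptions, not the authors' implementation: each prefix is paired with its own suffix as the preferred continuation and with a suffix from a different document as the rejected one, and the pair is scored with a standard Bradley-Terry pairwise loss. The function names, the 50% split point, and the negative-sampling rule are all hypothetical.

```python
import math
import random


def split_document(text, prefix_fraction=0.5):
    """Split a document into (prefix, suffix) at a word boundary.

    Assumes the document has at least two words. The split fraction
    is an illustrative choice, not taken from the paper.
    """
    words = text.split()
    cut = max(1, min(len(words) - 1, int(len(words) * prefix_fraction)))
    return " ".join(words[:cut]), " ".join(words[cut:])


def build_preference_pairs(documents, seed=0):
    """Build one (prompt, chosen, rejected) record per document.

    ASSUMPTION: the true suffix is "chosen" and a suffix sampled from
    another document is "rejected". Requires at least two documents.
    """
    rng = random.Random(seed)
    splits = [split_document(doc) for doc in documents]
    pairs = []
    for i, (prefix, suffix) in enumerate(splits):
        j = rng.choice([k for k in range(len(splits)) if k != i])
        pairs.append(
            {"prompt": prefix, "chosen": suffix, "rejected": splits[j][1]}
        )
    return pairs


def bradley_terry_loss(reward_chosen, reward_rejected):
    """Pairwise loss -log(sigmoid(reward_chosen - reward_rejected))."""
    margin = reward_chosen - reward_rejected
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    docs = [
        "two plus two equals four",
        "the derivative of x squared is two x",
        "a prime has exactly two divisors",
    ]
    for pair in build_preference_pairs(docs):
        print(pair)
    # A reward model that scores the true suffix higher incurs a lower loss.
    print(bradley_terry_loss(2.0, 0.0), bradley_terry_loss(0.0, 2.0))
```

In a real setup the two scalar rewards would come from a reward model scoring the prompt concatenated with each candidate suffix; the sketch leaves that model out so the pair construction and the loss stay self-contained.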

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification