Auto-Rubric: Learning From Implicit Weights to Explicit Rubrics for Reward Modeling

Lipeng Xie; Sen Huang; Zhuo Zhang; Anni Zou; Yunpeng Zhai; Dingchao Ren; Kezun Zhang; Haoyuan Hu; Boyin Liu; Haoran Chen; Zhaoyang Liu; Bolin Ding

arXiv:2510.17314·cs.LG·February 6, 2026



TL;DR

The paper replaces implicit neural reward weights with explicit, hierarchical natural-language rubrics learned through iterative, verification-driven refinement, and reports that these rubrics outperform fully trained reward models while using far less preference data.

Contribution

The paper presents a training-free framework for learning explicit rubrics from preference data, yielding interpretable reward functions without gradient descent.

Findings

01  Outperforms fully trained reward models on multiple benchmarks.

02  Achieves 80.91% on RewardBench2 with only 70 preference pairs.

03  Demonstrates high compressibility and interpretability of reward signals.

Abstract

Conventional reward modeling relies on gradient descent over neural weights, creating opaque, data-hungry "black boxes." We propose a paradigm shift from implicit to explicit reward parameterization, recasting optimization from continuous weight spaces to the discrete space of natural language rubrics. We introduce a training-free framework based on iterative rubric learning: it locally induces discriminative criteria via verification-driven refinement, and globally compresses the candidate criteria pool into a compact core set by maximizing an information-theoretic coding rate objective. We organize the compressed core set into a hierarchical rubric structure -- high-level evaluation dimensions supported by concrete verification checks -- serving as an interpretable, portable reward function. Empirically, our approach challenges prevailing data scaling assumptions: using only 70…
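
The global compression step described in the abstract can be illustrated with a small sketch. It assumes each candidate criterion has already been embedded as a vector, uses one common form of the coding rate from the lossy-coding literature, R = 1/2 log det(I + d/(n eps^2) Z Z^T), and selects criteria greedily. The function names (`coding_rate`, `select_core_set`), the greedy strategy, the `eps` value, and the toy embeddings are all illustrative assumptions, not the authors' implementation.

```python
import math


def log_det_spd(m):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(m[i][i] - s)
            else:
                chol[i][j] = (m[i][j] - s) / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def coding_rate(vectors, eps=0.5):
    """Coding rate of n vectors of dimension d (assumed form, see lead-in).

    Uses the n x n Gram matrix; by Sylvester's determinant identity this
    gives the same determinant as the d x d form I + scale * Z Z^T.
    """
    n = len(vectors)
    if n == 0:
        return 0.0
    d = len(vectors[0])
    scale = d / (n * eps ** 2)
    gram = [
        [scale * sum(a * b for a, b in zip(u, v)) + (1.0 if i == j else 0.0)
         for j, v in enumerate(vectors)]
        for i, u in enumerate(vectors)
    ]
    return 0.5 * log_det_spd(gram)


def select_core_set(embeddings, k, eps=0.5):
    """Greedily pick k indices whose embeddings maximize the coding rate."""
    selected = []
    remaining = list(range(len(embeddings)))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda i: coding_rate([embeddings[j] for j in selected + [i]], eps),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical embeddings: criterion 1 is a near-duplicate of criterion 0.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.14, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
core = select_core_set(embeddings, k=3)
print(core)
```

In this toy case the near-duplicate criterion is left out, because adding a vector nearly parallel to one already selected increases the coding rate less than adding an orthogonal one. That is the intuition behind using such an objective to compress a redundant candidate pool into a compact core set.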


Datasets

agentscope-ai/Auto-Rubric
dataset· 39 dl


Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Domain Adaptation and Few-Shot Learning