Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric

Ruipeng Jia; Yunyi Yang; Yuxin Wu; Yongbo Gai; Siyuan Tao; Mengyu Zhou; Jianhe Lin; Xiaoxi Jiang; Guanjun Jiang

arXiv:2602.14069 · cs.CL · March 2, 2026

Open Access

TL;DR

The paper introduces Open Rubric System (OpenRS), a framework for scalable reinforcement learning that uses explicit, adaptive rubrics and principles-based judgments to improve alignment and interpretability in open-ended tasks.

Contribution

The paper proposes a rubric-based, principle-driven LLM-as-a-Judge framework that replaces scalar reward models. It pairs adaptive pairwise rubrics (PAMR) with verifiable pointwise rubrics (PVRs), aiming for better alignment and robustness.

Findings

01. OpenRS improves discriminability in open-ended tasks.

02. The system enables explicit, inspectable reasoning processes.

03. It effectively combines human and automated refinement of principles.

Abstract

Scalar reward models compress multi-dimensional human preferences into a single opaque score, creating an information bottleneck that often leads to brittleness and reward hacking in open-ended alignment. We argue that robust alignment for non-verifiable tasks is fundamentally a principle generalization problem: reward should not be a learned function internalized into a judge, but an explicit reasoning process executed under inspectable principles. To operationalize this view, we present the Open Rubric System (OpenRS), a plug-and-play, rubrics-based LLM-as-a-Judge framework built around Pairwise Adaptive Meta-Rubrics (PAMR) and lightweight Pointwise Verifiable Rubrics (PVRs), which provide both hard-constraint guardrails and verifiable reward components when ground-truth or programmatic checks are available. OpenRS uses an explicit meta-rubric -- a constitution-like specification that…
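
As a rough illustration of the idea in the abstract, the sketch below gates a pairwise rubric comparison behind pointwise verifiable checks. It is not the authors' implementation: the abstract does not say how PAMR and PVRs are combined, so the gating rule, the weighted average, and every name here (Criterion, passes_checks, pairwise_preference) are assumptions made for illustration. The comparison functions are simple stand-ins for what would be an LLM judge, and the rubric is fixed rather than adapted per pair.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    """One weighted rubric criterion (hypothetical structure)."""
    name: str
    weight: float
    # Returns a value in [-1, 1]: positive favors response a, negative favors b.
    # In a real system this would be an LLM judge call; here it is a stand-in.
    compare: Callable[[str, str], float]


def passes_checks(response: str, checks: List[Callable[[str], bool]]) -> bool:
    """Pointwise programmatic checks used as hard-constraint guardrails."""
    return all(check(response) for check in checks)


def pairwise_preference(
    a: str,
    b: str,
    rubric: List[Criterion],
    checks: List[Callable[[str], bool]],
) -> float:
    """Return a preference in [-1, 1]; positive favors a (assumed convention)."""
    ok_a, ok_b = passes_checks(a, checks), passes_checks(b, checks)
    if not ok_a and not ok_b:
        return 0.0                      # both violate a hard constraint: no preference
    if ok_a != ok_b:
        return 1.0 if ok_a else -1.0    # the guardrail decides the pair outright
    total = sum(c.weight for c in rubric)
    if total == 0:
        return 0.0
    # Weighted average of per-criterion comparisons, each visible for inspection.
    return sum(c.weight * c.compare(a, b) for c in rubric) / total


# Toy rubric: prefer shorter answers, and (with more weight) answers citing a source.
rubric = [
    Criterion("concise", 1.0,
              lambda a, b: 1.0 if len(a) < len(b) else (-1.0 if len(a) > len(b) else 0.0)),
    Criterion("cites_source", 3.0,
              lambda a, b: float("source:" in a) - float("source:" in b)),
]
checks = [lambda r: len(r) > 0]  # hard constraint: the response must be non-empty

score = pairwise_preference("short answer", "longer answer with source: X", rubric, checks)
```

Because each criterion contributes a separate, named term, the final preference can be traced back to the criteria that produced it, unlike a single scalar from a learned reward model.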

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Machine Learning and Data Classification