RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation
Sunzhu Li, Jiale Zhao, Miteto Wei, Huimin Ren, Yang Zhou, Jingwen Yang, Shunyu Liu, Kaike Zhang, Wei Chen

TL;DR
RubricHub is a large, multi-domain rubric dataset generated through an automated coarse-to-fine process; post-training on it improves model performance on open-ended, reasoning-intensive tasks.
Contribution
The paper presents a novel automated rubric generation framework and a large-scale dataset, enabling enhanced supervision for reasoning-intensive open-ended generation tasks.
Findings
The RubricHub dataset contains approximately 110,000 examples across multiple domains.
Post-training with RubricHub improves model performance, achieving state-of-the-art results on HealthBench.
The approach surpasses proprietary models such as GPT-5 on specific reasoning benchmarks.
Abstract
Reinforcement Learning with Verifiable Rewards (RLVR) has driven substantial progress in reasoning-intensive domains like mathematics. However, optimizing open-ended generation remains challenging due to the lack of ground truth. While rubric-based evaluation offers a structured proxy for verification, existing methods suffer from scalability bottlenecks and coarse criteria, resulting in a supervision ceiling effect. To address this, we propose an automated Coarse-to-Fine Rubric Generation framework. By synergizing principle-guided synthesis, multi-model aggregation, and difficulty evolution, our approach produces comprehensive and highly discriminative criteria capable of capturing subtle nuances. Based on this framework, we introduce RubricHub, a large-scale (110k) and multi-domain dataset. We validate its utility through a two-stage post-training pipeline comprising…
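To make the idea of rubric-based evaluation as a proxy for verification concrete, here is a minimal Python sketch of how a rubric might be turned into a scalar reward. It is an illustration under stated assumptions, not the paper's method: the `Criterion` fields, the per-criterion weights, the weighted-fraction scoring rule, and the keyword-matching `keyword_judge` are all hypothetical. In practice the judge would presumably be a grader model rather than a string check.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    description: str  # what a good response should do
    weight: float     # hypothetical importance weight (assumption)
    keyword: str      # used only by the toy judge below (assumption)


def keyword_judge(response: str, criterion: Criterion) -> bool:
    """Toy stand-in for a grader model: passes if the keyword appears."""
    return criterion.keyword.lower() in response.lower()


def rubric_reward(
    response: str,
    rubric: List[Criterion],
    judge: Callable[[str, Criterion], bool] = keyword_judge,
) -> float:
    """Return the weighted fraction of rubric criteria the response satisfies."""
    total = sum(c.weight for c in rubric)
    if total <= 0:
        return 0.0
    earned = sum(c.weight for c in rubric if judge(response, c))
    return earned / total


# Illustrative rubric for a health-advice prompt (made-up criteria).
rubric = [
    Criterion("Recommends seeing a clinician", 2.0, "doctor"),
    Criterion("Mentions hydration", 1.0, "water"),
    Criterion("Notes warning signs", 1.0, "fever"),
]
response = "Drink water and see a doctor if symptoms persist."
print(rubric_reward(response, rubric))  # 0.75: two of three criteria met, weights 2 + 1 out of 4
```

Under this scoring rule, a rubric with only a few coarse criteria can separate responses into only a few reward levels, which is one way to read the "supervision ceiling effect" the abstract describes; finer, more discriminative criteria give the reward more resolution.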
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning and Data Classification
