Learned-Rule-Augmented Large Language Model Evaluators

Jie Meng; Jin Mao

arXiv:2512.01958·cs.AI·December 2, 2025

TL;DR

This paper introduces a rule-augmented framework for LLM-based evaluators, combining rule distillation and reinforcement learning to improve the evaluators' generalization and alignment across diverse natural language generation evaluation tasks.

Contribution

It proposes a novel rule-augmented paradigm with rule distillation, Chain-of-Rule guidance, and reinforcement learning to enhance LLM evaluators' effectiveness and generalizability.

Findings

01 Improved evaluation accuracy across multiple NLG tasks.

02 Enhanced alignment of LLM evaluators with data and rules.

03 Demonstrated scalability and versatility of the approach.

Abstract

Large language models (LLMs) are predominantly used as evaluators for natural language generation (NLG) tasks, but their application to broader evaluation scenarios remains limited. In this work, we explore the potential of LLMs as general evaluators across diverse tasks. Although LLM-based evaluators have made progress in different areas, existing methods struggle to generalize due to their reliance on costly, human-designed evaluation principles, which are often misaligned with both annotated data and LLMs' understanding. To address these challenges, we propose a rule-augmented evaluation paradigm. First, we introduce a rule distillation method that automatically extracts scoring rules from data using an LLM-assisted Monte Carlo Tree Search (MCTS), alleviating scalability issues and improving alignment with data. Second, to enable LLMs to effectively apply the learned rules, we propose…
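
To make the rule distillation idea in the abstract more concrete, here is a toy sketch of a tree search over sets of scoring rules. It is not the authors' implementation. The rule pool, the five annotated examples, and the reward (negative mean absolute error against the labels) are all invented for illustration. A fixed pool of candidate rules stands in for the LLM that would propose rules, and each new node is scored directly instead of by a rollout.

```python
import math
import random

# Hypothetical rule pool: each rule maps a text to a partial score.
# Assumption: in the paper an LLM proposes rules; a fixed pool stands in here.
CANDIDATE_RULES = {
    "has_period": lambda t: 1.0 if t.endswith(".") else 0.0,
    "long_enough": lambda t: 1.0 if len(t.split()) >= 4 else 0.0,
    "starts_upper": lambda t: 1.0 if t[:1].isupper() else 0.0,
    "contains_digit": lambda t: 1.0 if any(c.isdigit() for c in t) else 0.0,
}

# Toy annotated data (text, human score), invented for this sketch.
DATA = [
    ("The cat sat on the mat.", 2.0),
    ("cat mat", 0.0),
    ("A short one.", 1.0),
    ("the dog ran across the yard", 1.0),
    ("Room 101 is empty", 1.0),
]


def reward(rule_set):
    """Negative mean absolute error between summed rule scores and labels."""
    err = 0.0
    for text, label in DATA:
        pred = sum(CANDIDATE_RULES[name](text) for name in rule_set)
        err += abs(pred - label)
    return -err / len(DATA)


class Node:
    def __init__(self, rules, parent=None):
        self.rules = rules      # frozenset of rule names chosen so far
        self.parent = parent
        self.children = {}      # added rule name -> child Node
        self.visits = 0
        self.total = 0.0        # sum of rewards backed up through this node

    def untried(self):
        return [r for r in sorted(CANDIDATE_RULES)
                if r not in self.rules and r not in self.children]


def ucb(child, parent_visits, c=1.4):
    """UCB1 score: mean reward plus an exploration bonus."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(frozenset())
    best = (reward(root.rules), root.rules)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and not a leaf.
        while not node.untried() and node.children:
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: ucb(ch, parent.visits))
        # Expansion: add one rule that has not been tried from this node.
        options = node.untried()
        if options:
            rule = rng.choice(options)
            child = Node(node.rules | {rule}, parent=node)
            node.children[rule] = child
            node = child
        # Evaluation: score the node's own rule set (no rollout).
        r = reward(node.rules)
        if r > best[0]:
            best = (r, node.rules)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += r
            node = node.parent
    return best


best_reward, best_rules = search()
print(sorted(best_rules), best_reward)
```

On this toy data the labels are fully explained by the `has_period` and `long_enough` rules together, so the search should settle on that pair. A real system would replace `CANDIDATE_RULES` with model-generated rules and `reward` with an agreement measure on held-out annotations.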

Taxonomy

Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques