Checklists Are Better Than Reward Models For Aligning Language Models

Vijay Viswanathan; Yanchao Sun; Shuang Ma; Xiang Kong; Meng Cao; Graham Neubig; Tongshuang Wu

arXiv:2507.18624 · cs.CL · December 2, 2025

TL;DR

This paper introduces Reinforcement Learning from Checklist Feedback (RLCF), which extracts an instruction-specific checklist from each instruction, scores responses against each item using AI judges and specialized verifier programs, and combines those scores into the reward for reinforcement learning.

Contribution

The paper proposes RLCF, a reinforcement learning approach that replaces fixed criteria such as "helpfulness" with checklist-based feedback. Among the alignment methods compared on Qwen2.5-7B-Instruct, it is the only one that improves performance on all five benchmarks.

Findings

01. RLCF improves performance on all five benchmarks tested.

02. Significant gains include a 4-point boost on FollowBench.

03. RLCF enhances language models' ability to follow diverse instructions.

Abstract

Language models must be adapted to understand and follow user instructions. Reinforcement learning is widely used to facilitate this, typically using fixed criteria such as "helpfulness" and "harmfulness". In our work, we instead propose using flexible, instruction-specific criteria as a means of broadening the impact that reinforcement learning can have in eliciting instruction following. We propose "Reinforcement Learning from Checklist Feedback" (RLCF). From instructions, we extract checklists and evaluate how well responses satisfy each item, using both AI judges and specialized verifier programs, then combine these scores to compute rewards for RL. We compare RLCF with other alignment methods applied to a strong instruction-following model (Qwen2.5-7B-Instruct) on five widely studied benchmarks. RLCF is the only method to improve performance on every benchmark, including a…
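The reward computation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `ChecklistItem` type, the per-item `weight`, the `judge` callable, and the choice to average the judge and verifier scores before taking a weighted mean are all assumptions made for illustration. The abstract only states that per-item scores from AI judges and verifier programs are combined into a reward.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ChecklistItem:
    """One requirement extracted from an instruction (hypothetical structure)."""
    requirement: str
    weight: float = 1.0  # assumed importance weight, not taken from the paper
    # Optional programmatic check; returns True when the response passes.
    verifier: Optional[Callable[[str], bool]] = None


def checklist_reward(
    response: str,
    checklist: Sequence[ChecklistItem],
    judge: Callable[[str, str], float],
) -> float:
    """Combine per-item scores into a single scalar reward in [0, 1].

    `judge(response, requirement)` stands in for an AI judge and is assumed
    to return a score in [0, 1]. When an item has a verifier, the judge and
    verifier scores are averaged (an assumption made for this sketch).
    """
    total_weight = sum(item.weight for item in checklist)
    if total_weight <= 0:
        raise ValueError("checklist must have positive total weight")

    weighted_sum = 0.0
    for item in checklist:
        score = judge(response, item.requirement)
        if item.verifier is not None:
            score = 0.5 * (score + float(item.verifier(response)))
        weighted_sum += item.weight * score

    # Weighted mean of the per-item scores.
    return weighted_sum / total_weight
```

In an RL loop, a function like `checklist_reward` would be called once per sampled response, with `judge` backed by a language model rather than a stub. Keeping the verifier optional reflects the idea that only some requirements (for example, a word limit) can be checked by a program.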

Code & Models

Models

viswavi/qwen2.5_rlcf (model, 5 downloads, 1 like)

Datasets

viswavi/wildchecklists (dataset, 73 downloads)

Videos

Checklists Are Better Than Reward Models For Aligning Language Models (SlidesLive)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling