IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation

Bosi Wen; Yilin Niu; Cunxiang Wang; Pei Ke; Xiaoying Ling; Ying Zhang; Aohan Zeng; Hongning Wang; Minlie Huang

arXiv:2511.01014 · cs.CL · April 17, 2026

TL;DR

IF-CRITIC is an LLM critic for fine-grained, efficient, and reliable instruction-following evaluation. Its reward signals can be used to train models to follow instructions better, at lower computational cost than other LLM critic baselines.

Contribution

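The paper presents an LLM critic built in two steps: a checklist generator decomposes each instruction into a constraint checklist, and a multi-stage critique filtering mechanism then uses those checklists to collect high-quality critique training data. The resulting critic outperforms strong LLM-as-a-Judge baselines.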

Findings

01 IF-CRITIC surpasses strong LLM-as-a-Judge baselines in evaluation performance.

02 Using IF-CRITIC's reward signals enhances LLM instruction-following performance.

03 The approach reduces computational overhead compared to other LLM critic baselines.

Abstract

Instruction-following is a fundamental ability of Large Language Models (LLMs), requiring their generated outputs to follow multiple constraints imposed in input instructions. Numerous studies have attempted to enhance this ability through preference optimization or reinforcement learning based on reward signals from LLM-as-a-Judge. However, existing evaluation models for instruction-following still possess many deficiencies, such as substantial costs and unreliable assessments. To this end, we propose IF-CRITIC, an LLM critic for fine-grained, efficient, and reliable instruction-following evaluation. We first develop a checklist generator to decompose instructions and generate constraint checklists. With the assistance of the checklists, we collect high-quality critique training data through a multi-stage critique filtering mechanism and employ a constraint-level preference…
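The checklist idea described in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' implementation: the `ConstraintJudgment` structure, the `aggregate_reward` function, the example constraints, and the rule of averaging constraint-level verdicts into one score are all assumptions made here for clarity.

```python
from dataclasses import dataclass


@dataclass
class ConstraintJudgment:
    """One checklist item plus a critic verdict (hypothetical structure)."""
    constraint: str    # a single constraint decomposed from the instruction
    explanation: str   # the critic's free-text reasoning for this constraint
    satisfied: bool    # the constraint-level verdict


def aggregate_reward(judgments: list[ConstraintJudgment]) -> float:
    """Return the fraction of satisfied constraints (an assumed aggregation rule)."""
    if not judgments:
        raise ValueError("checklist is empty")
    return sum(j.satisfied for j in judgments) / len(judgments)


# Hand-written example; in practice the checklist would come from a
# checklist generator and the verdicts from a critic model.
checklist = [
    ConstraintJudgment("Write exactly three sentences", "The response has three sentences.", True),
    ConstraintJudgment("Mention the word 'ocean'", "The word appears in sentence two.", True),
    ConstraintJudgment("Do not use commas", "Sentence one contains a comma.", False),
    ConstraintJudgment("Respond in English", "The response is in English.", True),
]

reward = aggregate_reward(checklist)
failed = [j.constraint for j in checklist if not j.satisfied]
print(reward, failed)
```

Keeping one verdict and one explanation per constraint is what makes the evaluation fine-grained: a training loop could consume the scalar `reward`, while the list in `failed` shows which constraints the response violated.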

Code & Models

Repositories

thu-coai/IF-CRITIC (GitHub)
