Self-Evolving Critique Abilities in Large Language Models

Zhengyang Tang; Ziniu Li; Zhenyang Xiao; Tian Ding; Ruoyu Sun; Benyou Wang; Dayiheng Liu; Fei Huang; Tianyu Liu; Bowen Yu; Junyang Lin

arXiv:2501.05727·cs.CL·August 5, 2025

TL;DR

This paper introduces SCRIT, a self-evolving framework that improves the critique abilities of large language models by training on self-generated data, using a contrastive-critic approach grounded in reference solutions together with self-validation. It reports gains in critique-correction accuracy and error-identification F1-score without external supervision.

Contribution

The paper presents SCRIT, a self-training approach that improves LLM critique abilities using self-generated data and a contrastive-critic approach, without relying on human annotations or more powerful models.

Findings

01. Achieves a 10.0% gain in critique-correction accuracy

02. Improves error-identification F1-score by 19.0%

03. Performance scales positively with data and model size

Abstract

Despite their remarkable performance, Large Language Models (LLMs) face a critical challenge: providing feedback for tasks where human evaluation is difficult or where LLMs potentially outperform humans. In such scenarios, leveraging the critique ability of LLMs themselves - identifying and correcting flaws - shows considerable promise. This paper explores enhancing critique abilities of LLMs, noting that current approaches rely on human annotations or more powerful models, leaving the challenge of improving critique abilities without external supervision unresolved. We introduce SCRIT (Self-evolving CRITic), a framework that trains LLMs with self-generated data to evolve their critique abilities. To address the low quality of naively generated data, we propose a contrastive-critic approach that uses reference solutions during data synthesis to enhance the model's understanding of key…
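The data-synthesis idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a critic sees a candidate solution alongside a reference solution, and that self-validation keeps a critique only when its corrected answer matches the reference answer; the abstract as shown here does not spell out that check. Every name in it (`Example`, `Critique`, `self_validate`, `build_training_set`, `toy_critic`) is hypothetical, and `toy_critic` is a stand-in for an actual LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    problem: str
    candidate_solution: str   # solution to be critiqued
    reference_solution: str   # shown to the critic during data synthesis
    reference_answer: str     # final answer of the reference solution


@dataclass
class Critique:
    text: str                 # step-by-step critique of the candidate
    corrected_answer: str     # final answer after applying the correction


# Hypothetical interface: any callable that maps an Example to a Critique.
CriticFn = Callable[[Example], Critique]


def self_validate(example: Example, critique: Critique) -> bool:
    """Assumed check: keep a critique only if its correction reaches
    the reference answer."""
    return critique.corrected_answer.strip() == example.reference_answer.strip()


def build_training_set(
    examples: List[Example], critic: CriticFn
) -> List[Tuple[Example, Critique]]:
    """Generate one critique per example and keep those that pass validation."""
    kept = []
    for ex in examples:
        critique = critic(ex)
        if self_validate(ex, critique):
            kept.append((ex, critique))
    return kept


def toy_critic(ex: Example) -> Critique:
    """Stand-in for the model: 'succeeds' only on problems marked easy."""
    if ex.problem.startswith("easy"):
        return Critique("The final step is wrong; recomputing.", ex.reference_answer)
    return Critique("The solution looks correct.", "unknown")
```

In a self-evolving setup, the pairs returned by `build_training_set` would serve as fine-tuning data for the same model that produced them. The exact-match comparison in `self_validate` is the simplest possible stand-in; a real pipeline would likely need a more tolerant answer comparison.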

Taxonomy

Topics: Complex Systems and Decision Making