ICRL: Learning to Internalize Self-Critique with Reinforcement Learning

Jianbo Lin; Xiaomin Yu; Yi Xin; Yifu Guo; Zhuosong Jiang; Zhongqi Yue; Weishi Wang; Heqing Zou; Chengwei Qin; Hui Xiong

arXiv:2605.15224·cs.AI·May 18, 2026

TL;DR

ICRL introduces a reinforcement learning framework enabling language models to internalize self-critique, leading to improved performance without external feedback across reasoning tasks.

Contribution

The paper presents a joint training method in which a solver and a critic share a backbone, combined with new stabilization techniques, so that the model improves itself and internalizes the critic's guidance.

Findings

01 Achieved 6.4 and 7.0 point improvements on agentic and mathematical reasoning tasks.

02 Learned 8B critic performs comparably to 32B critics with fewer tokens.

03 Demonstrated consistent performance gains across diverse benchmarks.

Abstract

Large language model-based agents make mistakes, yet critique can often guide the same model toward correct behavior. However, when critique is removed, the model may fail again on the same query, indicating that it has not internalized the critique's guidance into its underlying capability. Meanwhile, a frozen critic cannot improve its feedback quality over time, limiting the potential for iterative self-improvement. To address this, we propose learning to internalize self-critique with reinforcement learning (ICRL), a novel framework that jointly trains a solver and a critic from a shared backbone to convert critique-induced success into unassisted solver ability. The critic is rewarded based on the solver's subsequent performance gain, incentivizing actionable feedback. To address the distribution shift between critique-conditioned and critique-free behavior, ICRL introduces a…
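The abstract says the critic is rewarded based on the solver's subsequent performance gain. One plausible reading of that idea can be sketched as follows; this is an illustrative assumption, not the paper's actual reward. The function name `critic_reward`, the binary success outcomes, and the simple difference of means are all hypothetical choices made for the example.

```python
from statistics import mean
from typing import Sequence


def critic_reward(
    base_outcomes: Sequence[float],
    critiqued_outcomes: Sequence[float],
) -> float:
    """Hypothetical gain-based critic reward (not taken from the paper).

    base_outcomes: solver success scores (e.g. 0 or 1) on a query
        when no critique is provided.
    critiqued_outcomes: solver success scores on the same query
        when conditioned on the critic's feedback.

    Returns the mean success with critique minus the mean success
    without it, so a critique that helps the solver earns a positive
    reward and one that hurts it earns a negative reward.
    """
    if not base_outcomes or not critiqued_outcomes:
        raise ValueError("need at least one rollout in each group")
    return mean(critiqued_outcomes) - mean(base_outcomes)


# Four rollouts per condition: the solver succeeds once unaided
# and three times when given the critique.
gain = critic_reward([0, 0, 1, 0], [1, 1, 1, 0])
print(gain)
```

Under this assumed formulation, a critique that leaves the solver's success rate unchanged earns zero reward, which is one way to discourage feedback that is fluent but not actionable.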

Code & Models

Repository: brick-pid/ICRL (GitHub)
