RAGEN-2: Reasoning Collapse in Agentic RL
Zihan Wang, Chi Gui, Xing Jin, Qineng Wang, Licheng Liu, Kangrui Wang, Shiqi Chen, Linjie Li, Zhengyuan Yang, Pingyue Zhang, Yiping Lu, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li

TL;DR
This paper identifies a failure mode in RL-trained multi-turn LLM agents called template collapse, where models rely on input-agnostic templates, and proposes mutual information diagnostics and SNR-aware filtering to improve reasoning stability and task performance.
Contribution
It introduces a mutual information-based diagnostic for reasoning collapse and proposes SNR-aware filtering to enhance reasoning diversity and task success.
Findings
Mutual information correlates more strongly with final task performance than entropy does.
Template collapse can occur even when entropy remains stable.
SNR-aware filtering improves input dependence and task performance across multiple tasks.
Abstract
RL training of multi-turn LLM agents is inherently unstable, and reasoning quality directly determines task performance. Entropy is widely used to track reasoning stability. However, entropy only measures diversity within the same input, and cannot tell whether reasoning actually responds to different inputs. In RAGEN-2, we find that even with stable entropy, models can rely on fixed templates that look diverse but are input-agnostic. We call this template collapse, a failure mode invisible to entropy and all existing metrics. To diagnose this failure, we decompose reasoning quality into within-input diversity (Entropy) and cross-input distinguishability (Mutual Information, MI), and introduce a family of mutual information proxies for online diagnosis. Across diverse tasks, mutual information correlates with final performance much more strongly than entropy, making it a more reliable…
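The decomposition described in the abstract can be illustrated on a toy discrete example. The sketch below is not the paper's online proxy; it computes exact conditional entropy H(Z|X) and mutual information I(X;Z) = H(Z) - H(Z|X) from a hand-written table, assuming uniformly distributed inputs. The function names (`entropy`, `diagnostics`) and both probability tables are hypothetical and chosen only for illustration.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def diagnostics(cond):
    """Return (H(Z|X), I(X;Z)) in bits.

    cond[i][j] is P(reasoning pattern j | input i).
    Assumption for this toy sketch: inputs are uniformly distributed.
    """
    n_inputs = len(cond)
    n_patterns = len(cond[0])
    # Marginal over reasoning patterns, averaging over the uniform inputs.
    marginal = [sum(row[j] for row in cond) / n_inputs for j in range(n_patterns)]
    # Within-input diversity: average entropy of each input's distribution.
    h_cond = sum(entropy(row) for row in cond) / n_inputs
    # Cross-input distinguishability: I(X;Z) = H(Z) - H(Z|X).
    mi = entropy(marginal) - h_cond
    return h_cond, mi


# Hypothetical "collapsed" policy: both inputs draw from the same two templates.
collapsed = [[0.5, 0.5, 0.0, 0.0],
             [0.5, 0.5, 0.0, 0.0]]

# Hypothetical "responsive" policy: each input uses its own pair of patterns.
responsive = [[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]]

h_collapsed, mi_collapsed = diagnostics(collapsed)
h_responsive, mi_responsive = diagnostics(responsive)

print(f"collapsed:  H(Z|X)={h_collapsed:.2f} bits, I(X;Z)={mi_collapsed:.2f} bits")
print(f"responsive: H(Z|X)={h_responsive:.2f} bits, I(X;Z)={mi_responsive:.2f} bits")
```

In this toy setup both policies have the same within-input entropy (1 bit), so an entropy monitor would not separate them, while mutual information is 0 bits for the collapsed policy and 1 bit for the responsive one.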
