Entropy-Gradient Inversion: Moving Toward Internal Mechanism of Large Reasoning Models
Junyao Yang, Chen Qian, Kun Wang, Linfeng Zhang, Quanshi Zhang, Yong Liu, Dongrui Liu

TL;DR
This paper introduces Entropy-Gradient Inversion as a geometric signature of reasoning in large models and proposes CorR-PO, a regularization method that improves reasoning performance across benchmarks.
Contribution
It formally defines entropy-gradient inversion, links it to reasoning ability, and develops CorR-PO to enhance reasoning in large models through RL reward regularization.
Findings
Entropy-gradient inversion correlates with reasoning capability.
CorR-PO outperforms state-of-the-art methods on reasoning benchmarks.
Stronger inversion signals lead to better reasoning performance.
Abstract
The advancement of Large Reasoning Models (LRMs) has catalyzed a paradigm shift from reactive "fast thinking" text generation to systematic, step-by-step "slow thinking" reasoning, unlocking state-of-the-art performance in complex mathematical and logical tasks. However, the field faces the fundamental gap between token-level behavioral analysis and internal reasoning mechanisms, and the instability of reinforcement learning (RL) for reasoning optimization relying on costly external verifiers. We identify and formally define Entropy-Gradient Inversion, a robust negative correlation between token entropy and logit gradients that acts as a definitive geometric fingerprint for LRM reasoning capability. Building on this, we propose Correlation-Regularized Group Policy Optimization (CorR-PO), which embeds this inversion signature into RL reward regularization.…
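The abstract describes Entropy-Gradient Inversion as a negative correlation between token entropy and logit gradients. The sketch below shows one way such a correlation could be measured. It is not the paper's definition. It assumes that "logit gradient" means the L2 norm of the gradient of the per-token negative log-likelihood with respect to the logits (which equals softmax minus one-hot), and that the correlation is Pearson's r taken across token positions. All function names are hypothetical, and the toy logits are made up for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one token's logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_entropy(logits):
    # Shannon entropy (in nats) of the next-token distribution.
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def logit_grad_norm(logits, target):
    # Assumption: gradient of -log p[target] w.r.t. the logits,
    # which is softmax(logits) - onehot(target); we return its L2 norm.
    probs = softmax(logits)
    grad = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return math.sqrt(sum(g * g for g in grad))

def pearson(xs, ys):
    # Pearson correlation coefficient; returns 0.0 if either input is constant.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def entropy_gradient_correlation(step_logits, step_targets):
    # Correlate per-token entropy with per-token logit-gradient norm.
    entropies = [token_entropy(l) for l in step_logits]
    grad_norms = [logit_grad_norm(l, t) for l, t in zip(step_logits, step_targets)]
    return pearson(entropies, grad_norms)

if __name__ == "__main__":
    # Toy data: three token positions over a four-word vocabulary.
    toy_logits = [[4.0, 0.5, 0.1, 0.0], [1.0, 1.0, 0.9, 0.8], [0.2, 3.0, 0.1, 0.3]]
    toy_targets = [0, 2, 1]
    print(entropy_gradient_correlation(toy_logits, toy_targets))
```

Under these assumptions, a value near -1 would indicate that high-entropy tokens tend to have small logit gradients. How the paper defines the gradient term and how CorR-PO turns the correlation into a reward regularizer are not specified in the text above, so neither is modeled here.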
