From "Aha Moments" to Controllable Thinking: Toward Meta-Cognitive Reasoning in Large Reasoning Models via Decoupled Reasoning and Control
Rui Ha, Chaozhuo Li, Rui Pu, Sen Su

TL;DR
This paper introduces MERA, a framework that decouples reasoning and control in large reasoning models, enabling explicit regulation of reasoning processes to improve efficiency and accuracy.
Contribution
MERA (Meta-cognitive Reasoning Framework) separates reasoning from control, using auxiliary models and policy optimization to regulate the reasoning process of large reasoning models.
Findings
Improves reasoning efficiency and reduces latency.
Enhances reasoning accuracy on benchmarks.
Provides explicit reasoning-control traces.
Abstract
Large Reasoning Models (LRMs) have demonstrated a latent capacity for complex reasoning by spontaneously exhibiting cognitive behaviors such as step-by-step reasoning, reflection, and backtracking, commonly referred to as "Aha Moments". However, such emergent behaviors remain unregulated and uncontrolled, often resulting in overthinking, where the model continues generating redundant reasoning content even after reaching reliable conclusions. This leads to excessive computational costs and increased latency, limiting the practical deployment of LRMs. The root cause lies in the absence of intrinsic regulatory mechanisms, as current models are unable to monitor and adaptively manage their reasoning process to determine when to continue, backtrack, or terminate. To address this issue, we propose the Meta-cognitive Reasoning Framework (MERA), which explicitly decouples the thinking process…
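The abstract frames control as a decision to continue, backtrack, or terminate. As a rough illustration of what separating reasoning from control could look like, here is a minimal Python sketch. It is not MERA's implementation: the names `Control`, `controlled_reasoning`, `toy_step`, and `toy_control` are hypothetical, and both the reasoner and the controller are toy callables standing in for model calls.

```python
from enum import Enum
from typing import Callable, List


class Control(Enum):
    """Hypothetical control signals, named after the abstract's wording."""
    CONTINUE = "continue"
    BACKTRACK = "backtrack"
    TERMINATE = "terminate"


def controlled_reasoning(
    propose_step: Callable[[List[str]], str],
    decide: Callable[[List[str]], Control],
    max_steps: int = 16,
) -> List[str]:
    """Alternate reasoning steps with explicit control decisions.

    Assumption: one control decision follows every reasoning step.
    The source does not specify this granularity.
    """
    trace: List[str] = []
    for _ in range(max_steps):
        trace.append(propose_step(trace))  # reasoning: extend the trace
        signal = decide(trace)             # control: inspect the trace
        if signal is Control.TERMINATE:
            break                          # stop instead of overthinking
        if signal is Control.BACKTRACK:
            trace.pop()                    # discard the step just taken
    return trace


def toy_step(trace: List[str]) -> str:
    # Toy stand-in for a model call: label the next step by position.
    return f"step {len(trace) + 1}"


def toy_control(trace: List[str]) -> Control:
    # Toy stand-in for a controller: stop once three steps are kept.
    return Control.TERMINATE if len(trace) >= 3 else Control.CONTINUE


if __name__ == "__main__":
    print(controlled_reasoning(toy_step, toy_control))
```

With the toy controller, the loop stops after three steps even though `max_steps` allows sixteen, which is the kind of early exit the abstract contrasts with overthinking. Keeping the decision in a separate callable also leaves an explicit record of where control intervened.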
