CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs
Haoxi Li, Sikai Bai, Jie Zhang, Song Guo

TL;DR
This paper introduces CoRE, a label-free self-evaluation method for large reasoning models that improves reasoning efficiency and accuracy by detecting cyclical reasoning patterns without external labels.
Contribution
The paper proposes Chain-of-Reasoning Embedding (CoRE), a series of hidden states in latent space used for label-free self-evaluation of intermediate reasoning steps in LRMs, and CoRE-Eval, a training-free framework that detects redundant reasoning to improve efficiency and accuracy.
Findings
Reduces reasoning steps by up to 33.2%
Improves accuracy by around 10% on benchmarks
Achieves 70% accuracy on AIME with a 32B model
Abstract
Large reasoning models (LRMs) have demonstrated impressive capabilities in domains like mathematics and program synthesis. Despite their strong performance, LRMs often exhibit overthinking -- excessive and redundant reasoning steps that introduce inefficiencies during inference. This phenomenon raises an important question for LRM self-evaluation: How can a model autonomously assess the correctness of its own reasoning trajectory without external labels? To address this, we propose Chain-of-Reasoning Embedding (CoRE), a series of hidden states in latent space to enable label-free self-evaluation on intermediate reasoning steps of LRMs, so as to enhance metacognition abilities for improved reasoning efficiency. By analyzing the geometric properties of the CoRE trajectories, we reveal that redundant reasoning usually presents cyclical fluctuations, which correspond to repetitive and…
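The detection rule used by CoRE-Eval is not spelled out in this summary, so the following is only a hypothetical illustration of the general idea that a reasoning trajectory which keeps returning to earlier points in latent space may be looping. It assumes each reasoning step has already been reduced to one vector; the names `cosine` and `first_revisit`, the cosine-similarity criterion, and the `threshold` and `min_gap` parameters are assumptions made for this sketch and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def first_revisit(states, threshold=0.99, min_gap=2):
    """Return the index of the first step whose vector points in nearly the
    same direction as a vector at least `min_gap` steps earlier, else None.

    Hypothetical criterion for illustration; not the paper's CoRE-Eval rule.
    """
    for t in range(len(states)):
        # Compare only against steps that are at least min_gap steps back.
        for s in range(0, t - min_gap + 1):
            if cosine(states[t], states[s]) >= threshold:
                return t
    return None

# Toy 2-D trajectories standing in for per-step hidden states.
progressing = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-0.6, 0.8]]
looping = progressing + [[1.0, 0.0], [0.8, 0.6]]  # returns to earlier states

print(first_revisit(progressing))
print(first_revisit(looping))
```

In a setting like the one the abstract describes, a step flagged this way could serve as a signal to stop generating further reasoning; how the actual method decides this is not covered by the text above.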
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)
