Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

Siddharth Boppana; Annabel Ma; Max Loeffler; Raphael Sarfati; Eric Bigelow; Atticus Geiger; Owen Lewis; Jack Merullo

arXiv:2603.05488·cs.CL·March 13, 2026

TL;DR

This paper investigates whether the chain-of-thought produced by large reasoning models reflects their internal beliefs. It finds that models often keep generating reasoning tokens after they are already confident in their final answer, and it proposes probe-guided early exit to cut that excess generation.

Contribution

The study analyzes performative chain-of-thought by comparing activation probing, early forced answering, and a CoT monitor, separating genuine reasoning from token generation that continues after the answer is settled. It also proposes probe-guided early exit for more efficient inference.

Findings

01

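How early the final answer can be decoded from activations depends on task difficulty: far earlier than a CoT monitor can tell for easy recall-based MMLU questions, in contrast to difficult multihop GPQA-Diamond questions.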

02

Inflection points (e.g., backtracking, 'aha' moments) occur almost exclusively in responses where probes show large belief shifts, suggesting they track genuine uncertainty.

03

Probe-guided early exit significantly reduces the number of generated tokens while maintaining accuracy.
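
As a rough illustration of the early-exit idea in the third finding, the sketch below assumes a probe has already produced one confidence score per chain-of-thought step. The function names, the `threshold` and `patience` parameters, and the stopping rule are hypothetical and are not taken from the paper; they only show one plausible way such a rule could work.

```python
from typing import Optional, Sequence


def early_exit_step(
    confidences: Sequence[float],
    threshold: float = 0.9,   # hypothetical confidence cutoff
    patience: int = 2,        # hypothetical number of consecutive confident steps
) -> Optional[int]:
    """Return the first step index at which the probe confidence has been
    at or above `threshold` for `patience` consecutive steps, else None."""
    streak = 0
    for step, conf in enumerate(confidences):
        streak = streak + 1 if conf >= threshold else 0
        if streak >= patience:
            return step
    return None


def tokens_saved(
    confidences: Sequence[float],
    tokens_per_step: Sequence[int],
    threshold: float = 0.9,
    patience: int = 2,
) -> int:
    """Count the tokens in steps that would be skipped by exiting early."""
    step = early_exit_step(confidences, threshold, patience)
    if step is None:
        return 0  # the probe never became confident, so nothing is skipped
    return sum(tokens_per_step[step + 1:])


# Illustrative, made-up probe confidences for a six-step chain of thought.
demo_confidences = [0.30, 0.95, 0.50, 0.92, 0.97, 0.99]
demo_tokens = [10, 10, 10, 10, 10, 10]
demo_exit = early_exit_step(demo_confidences)
demo_saved = tokens_saved(demo_confidences, demo_tokens)
```

Requiring several consecutive confident steps is one simple guard against exiting on a momentary spike in confidence; whether the paper uses a similar guard is not stated in this summary.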

Abstract

We provide evidence of performative chain-of-thought (CoT) in reasoning models, where a model becomes strongly confident in its final answer, but continues generating tokens without revealing its internal belief. Our analysis compares activation probing, early forced answering, and a CoT monitor across two large models (DeepSeek-R1 671B and GPT-OSS 120B) and finds task-difficulty-specific differences: the model's final answer is decodable from activations far earlier in the CoT than a monitor is able to say, especially for easy recall-based MMLU questions. We contrast this with genuine reasoning in difficult multihop GPQA-Diamond questions. Despite this, inflection points (e.g., backtracking, 'aha' moments) occur almost exclusively in responses where probes show large belief shifts, suggesting these behaviors track genuine uncertainty rather than learned "reasoning theater." Finally,…
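
The comparison in the abstract, between when a probe can decode the final answer and when a CoT monitor can state it, can be made concrete with a small sketch. It assumes each method yields one predicted answer per CoT step; the helper names and the "gap" measure are hypothetical illustrations, not the paper's metric.

```python
from typing import Optional, Sequence


def first_stable_step(predictions: Sequence[str], final_answer: str) -> Optional[int]:
    """Return the earliest step from which every later prediction equals
    `final_answer`, or None if the last prediction does not match it."""
    first = None
    for step, pred in enumerate(predictions):
        if pred == final_answer:
            if first is None:
                first = step
        else:
            first = None  # the prediction changed, so restart the search
    return first


def performative_gap(
    probe_preds: Sequence[str],
    monitor_preds: Sequence[str],
    final_answer: str,
) -> Optional[int]:
    """Number of steps between the probe settling on the final answer and
    the monitor doing so (hypothetical measure; larger means more of the
    CoT is generated after the answer is already decodable)."""
    probe_step = first_stable_step(probe_preds, final_answer)
    monitor_step = first_stable_step(monitor_preds, final_answer)
    if probe_step is None or monitor_step is None:
        return None
    return monitor_step - probe_step


# Made-up example: the probe decodes "C" from the first step, while the
# monitor can only state "C" at the last step ("?" means undetermined).
demo_gap = performative_gap(["C", "C", "C", "C"], ["?", "?", "?", "C"], "C")
```

Under these assumptions, a large gap corresponds to the performative pattern described for easy questions, and a gap near zero corresponds to reasoning in which the answer only becomes decodable as the text itself arrives at it.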

Taxonomy

Topics: Embodied and Extended Cognition · Child and Animal Learning Development · Mind wandering and attention