The Metacognitive Probe: Five Behavioural Calibration Diagnostics for LLMs

Rafael C. T. Oliveira

arXiv:2605.09844 · cs.AI · May 12, 2026

TL;DR

The paper introduces the Metacognitive Probe, a five-task diagnostic that decomposes an LLM's confidence behaviour into five dimensions and reveals large calibration differences within a single model.

Contribution

It presents a five-task diagnostic that decomposes LLM confidence behaviour into separate dimensions, showing what aggregate benchmarks miss and exposing task-specific pockets of good and poor calibration.

Findings

01

Gemini 2.5 Flash shows high within-task calibration (88) but poor cross-task difficulty prediction (41).

02

The diagnostic reveals a 47-point dissociation in confidence calibration within a single model.

03

Composite benchmarks do not capture the full scope of model confidence and calibration issues.

Abstract

The Metacognitive Probe is an exploratory five-task, 15-slot diagnostic that decomposes an LLM's confidence behaviour into five behaviourally-distinct dimensions: confidence calibration (T1-CC), epistemic vigilance (T2-EV), knowledge boundary (T3-KB), calibration range (T4-CR), and reasoning-chain validation (T5-RCV). It is evaluated on N=8 frontier models and N=69 humans. The instrument is motivated by Flavell (1979) and Nelson and Narens (1990) but operates on observable confidence-correctness alignment; it is not a validated cross-species metacognition scale, and the pre-specified human developmental hypothesis was falsified. Composite benchmarks (MMLU, BIG-Bench, HELM, GPQA) ask whether a model produces a correct response. They are silent on whether the model knows when its response is wrong. A model can score 80 on a composite calibration benchmark and still be wildly…
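The idea of "confidence-correctness alignment" can be illustrated with a generic calibration metric. The sketch below computes expected calibration error (ECE) separately for each task. It is not the paper's scoring method: the function name, the task labels, and the response data are all hypothetical and chosen only to show how one model can look well calibrated on one task and poorly calibrated on another.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Generic ECE: weighted mean gap between stated confidence and accuracy.

    Illustrative only; the paper's own scoring is not reproduced here.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need two non-empty sequences of equal length")

    # Group responses into equal-width confidence bins over [0, 1].
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Hypothetical responses from one model on two tasks (made-up data).
task_a = ([1.0, 1.0, 0.0, 0.0], [True, True, False, False])  # aligned
task_b = ([0.9, 0.9, 0.9, 0.9], [True, False, False, False])  # overconfident

ece_a = expected_calibration_error(*task_a)
ece_b = expected_calibration_error(*task_b)
print(f"task A ECE: {ece_a:.2f}, task B ECE: {ece_b:.2f}")
```

Averaging the two values into one number would hide the gap between the tasks, which is the kind of within-model dissociation the abstract says composite benchmarks are silent on.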
