Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
Jon-Paul Cacioli

TL;DR
This paper introduces an evaluation framework based on Type-2 Signal Detection Theory that measures the metacognitive sensitivity of large language models, separating what a model knows from how well it knows what it knows.
Contribution
It separates a model's factual knowledge (Type-1 sensitivity) from its metacognitive awareness (Type-2 sensitivity) using meta-d' and M-ratio, revealing differences that calibration metrics such as ECE and Brier score do not capture.
Findings
Metacognitive efficiency varies across models with similar knowledge levels.
Metacognitive performance is domain-specific and model-dependent.
Temperature affects confidence criteria without changing metacognitive capacity.
Abstract
Standard evaluation of LLM confidence relies on calibration metrics (ECE, Brier score) that conflate two distinct capacities: how much a model knows (Type-1 sensitivity) and how well it knows what it knows (Type-2 metacognitive sensitivity). We introduce an evaluation framework based on Type-2 Signal Detection Theory that decomposes these capacities using meta-d' and the metacognitive efficiency ratio M-ratio. Applied to four LLMs (Llama-3-8B-Instruct, Mistral-7B-Instruct-v0.3, Llama-3-8B-Base, Gemma-2-9B-Instruct) across 224,000 factual QA trials, we find: (1) metacognitive efficiency varies substantially across models even when Type-1 sensitivity is similar -- Mistral achieves the highest d' but the lowest M-ratio; (2) metacognitive efficiency is domain-specific, with different models showing different weakest domains, invisible to aggregate metrics; (3) temperature manipulation…
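The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's pipeline: meta-d' is typically estimated by fitting a Type-2 SDT model, which is not reproduced here. The sketch only shows the standard equal-variance d' formula, the M-ratio as meta-d' divided by d', and a non-parametric Type-2 AUROC used as a stand-in measure of how well confidence separates correct from incorrect answers. All function names and the toy numbers are hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Type-1 sensitivity under equal-variance SDT: z(H) - z(F).

    Rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


def m_ratio(meta_d: float, d: float) -> float:
    """Metacognitive efficiency: meta-d' expressed as a fraction of d'.

    meta_d is assumed to come from a separate model fit (not shown here).
    """
    return meta_d / d


def type2_auroc(confidences, correct) -> float:
    """Probability that a correct trial has higher confidence than an
    incorrect one (ties count as 0.5). This is a non-parametric stand-in
    for Type-2 sensitivity, not an estimate of meta-d'.
    """
    pos = [c for c, ok in zip(confidences, correct) if ok]
    neg = [c for c, ok in zip(confidences, correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect trial")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    d = d_prime(0.8, 0.2)
    print(f"d' = {d:.3f}")
    print(f"M-ratio = {m_ratio(0.8, d):.3f}")
    print(type2_auroc([0.9, 0.8, 0.3, 0.2], [True, True, False, False]))
```

An M-ratio near 1 would mean confidence uses nearly all the information available to the Type-1 decision, while values well below 1 indicate metacognitive inefficiency. Two models with the same d' can therefore differ in M-ratio, which is the pattern the abstract reports.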
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Psychometric Methodologies and Testing
