Before You Interpret the Profile: Validity Scaling for LLM Metacognitive Self-Report

Jon-Paul Cacioli

arXiv:2604.17707·cs.CL·April 21, 2026

TL;DR

This paper adapts clinical validity scales from the PAI and MMPI-3 to screen the response validity of LLM metacognitive self-report. Applied to 20 frontier models, the screen flags four as construct-level invalid and two as elevated, and the paper proposes a screening protocol.

Contribution

The paper introduces a validity scaling framework for LLMs, operationalizes six indices (L, K, F, Fp, RBS, TRIN), and shows that they separate invalid-profile models from valid-profile ones.

Findings

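01 Valid-profile models produce item-sensitive confidence (mean r = .18, 14 of 16 significant).

02 Invalid-profile models show no such item sensitivity (mean r = -.20; difference from valid-profile models d = 2.17, p = .001).

03 Chain-of-thought training produces two opposite response distortions.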

Abstract

Clinical personality assessment screens response validity before interpreting substantive scales. LLM evaluation does not. We apply the validity scaling framework from the PAI and MMPI-3 to metacognitive probe data from 20 frontier models across 524 items. Six validity indices are operationalised: L (maintaining confidence on errors), K (betting on errors), F (withdrawing consensus-endorsed items), Fp (withdrawing correct answers), RBS (inverted monitoring), and TRIN (fixed responding). A tiered classification system identifies four models as construct-level invalid and two as elevated. Valid-profile models produce item-sensitive confidence (mean r = .18, 14 of 16 significant). Invalid-profile models do not (mean r = -.20, d = 2.17, p = .001). Chain-of-thought training produces two opposite response distortions. Two latent dimensions account for 94.6% of index variance. Companion papers…
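
The item-sensitivity comparison in the abstract can be illustrated with a minimal sketch. It assumes that "item-sensitive confidence" means a per-model correlation between stated confidence and item-level correctness, and that the group contrast is a Cohen's d on those per-model correlations. The abstract does not spell out either computation, so the function names, the toy data, and the handling of constant confidence below are hypothetical, not the paper's implementation.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation; with binary ys this equals the point-biserial r.

    Returns None when either variable is constant, because the correlation
    is undefined there (e.g. a model that reports the same confidence on
    every item). That convention is an assumption of this sketch.
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def cohens_d(a, b):
    """Cohen's d between two groups, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                  / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled


# Toy data (hypothetical): one model's confidence and correctness on 4 items.
confidence = [0.9, 0.8, 0.3, 0.2]
correct = [1, 1, 0, 0]
r_model = pearson_r(confidence, correct)

# Toy per-model correlations for two groups (hypothetical values).
valid_rs = [0.20, 0.15, 0.22, 0.12]
invalid_rs = [-0.25, -0.10, -0.30, -0.15]
d_groups = cohens_d(valid_rs, invalid_rs)
```

In this sketch a model whose confidence tracks correctness gets a positive `r_model`, and `d_groups` expresses how far apart the two groups' correlations sit in pooled standard deviation units. The actual index definitions and thresholds should be taken from the paper and its repository.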

Code & Models

Repositories

synthiumjp/validity-scaling-llm (GitHub)
