When Should a Language Model Trust Itself? Same-Model Self-Verification as a Conditional Confidence Signal

Aditya Ajay Phalod

arXiv:2605.02915 · cs.CL · May 6, 2026

TL;DR

This paper evaluates same-model self-verification as a confidence signal for language models and finds that its value varies sharply across tasks, models, prompts, and baselines; it is not a universal uncertainty estimator.

Contribution

The paper compares self-verification empirically against two likelihood baselines, LL-AVG and LL-SUM, on ARC-Challenge and TruthfulQA-MC across multiple model families, scales, and prompt variants, and shows that its utility is conditional.

Findings

01

On ARC-Challenge, self-verification substantially improves over LL-AVG for Phi-2 and the Qwen models, with the largest gains in Qwen-7B.

02

On TruthfulQA-MC, self-verification is less reliable and often underperforms baselines.

03

The utility of self-verification depends on task, model, prompt, and baseline.

Abstract

Same-model self-verification, prompting a model to audit its own predicted answer, is a plausible confidence signal for selective prediction, but its practical value remains unclear once strong likelihood-based baselines are taken seriously. We evaluate self-verification against two such baselines, LL-AVG and LL-SUM, on ARC-Challenge and TruthfulQA-MC across multiple model families, scales, and prompt variants. We measure not only correctness ranking, but also abstention quality through AURC and operating-point analyses. The results are sharply task- and model-dependent. On ARC-Challenge, self-verification substantially improves over LL-AVG for Phi-2 and the Qwen models, with the largest gains appearing in Qwen-7B. On TruthfulQA-MC, however, the signal is less reliable: smaller models can become prompt-sensitive, DeepSeek-R1-Distill-8B degrades relative to LL-AVG, and LL-SUM often…
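
The quantities named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define them, so the sketch assumes that LL-SUM is the total token log-likelihood of an answer, that LL-AVG is its length-normalized mean, and that AURC is the selective risk (error rate among retained answers) averaged over all coverage levels. The function names and the toy data are hypothetical and are not the paper's implementation.

```python
from typing import Sequence


def ll_sum(token_logprobs: Sequence[float]) -> float:
    """Assumed LL-SUM: total log-likelihood of the answer tokens."""
    return float(sum(token_logprobs))


def ll_avg(token_logprobs: Sequence[float]) -> float:
    """Assumed LL-AVG: mean per-token log-likelihood (length-normalized)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return float(sum(token_logprobs)) / len(token_logprobs)


def aurc(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Area under the risk-coverage curve (lower is better).

    Examples are ranked by descending confidence. At each coverage level k/n
    the risk is the error rate among the k most confident examples, and the
    returned value is the mean of those risks over k = 1..n.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    # Stable sort: ties keep their original order.
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    errors = 0
    risks = []
    for k, i in enumerate(order, start=1):
        errors += 0 if correct[i] else 1
        risks.append(errors / k)
    return sum(risks) / len(risks)


# Toy data (hypothetical): four answers, the first two correct.
correct = [True, True, False, False]
good_signal = [0.9, 0.8, 0.3, 0.1]  # ranks correct answers first
bad_signal = [0.1, 0.3, 0.8, 0.9]   # ranks wrong answers first

aurc_good = aurc(good_signal, correct)
aurc_bad = aurc(bad_signal, correct)
print(f"AURC good={aurc_good:.4f} bad={aurc_bad:.4f}")

# Length effect: a longer answer can tie on LL-SUM yet score higher on LL-AVG.
short_answer = [-0.5, -1.5]
long_answer = [-0.5, -0.5, -0.5, -0.5]
print(ll_sum(short_answer), ll_sum(long_answer))
print(ll_avg(short_answer), ll_avg(long_answer))
```

Under these assumptions, any confidence score (a self-verification probability, LL-AVG, or LL-SUM) can be passed to `aurc` with the same correctness labels, so the signals are compared on the quality of the ranking they induce and not on their raw scale.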
