A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs

Artem Shelmanov; Ekaterina Fadeeva; Akim Tsvigun; Ivan Tsvigun; Zhuohan Xie; Igor Kiselev; Nico Daheim; Caiqi Zhang; Artem Vazhentsev; Mrinmaya Sachan; Preslav Nakov; Timothy Baldwin

arXiv:2505.08200 · cs.CL · May 14, 2025

TL;DR

This paper introduces pre-trained uncertainty quantification (UQ) heads: supervised auxiliary modules for LLMs that improve claim-level hallucination detection over unsupervised UQ methods and generalize across domains and languages. The heads use a Transformer architecture and features derived from LLM attention maps.

Contribution

The authors propose supervised UQ heads for LLMs that outperform unsupervised methods, enhancing hallucination detection and generalization across languages and domains.

Findings

01 State-of-the-art claim-level hallucination detection performance

02 Robustness across in-domain and out-of-domain prompts

03 Strong generalization to unseen languages

Abstract

Large Language Models (LLMs) have the tendency to hallucinate, i.e., to sporadically generate false or fabricated information. This presents a major challenge, as hallucinations often appear highly convincing and users generally lack the tools to detect them. Uncertainty quantification (UQ) provides a framework for assessing the reliability of model outputs, aiding in the identification of potential hallucinations. In this work, we introduce pre-trained UQ heads: supervised auxiliary modules for LLMs that substantially enhance their ability to capture uncertainty compared to unsupervised UQ methods. Their strong performance stems from the powerful Transformer architecture in their design and informative features derived from LLM attention maps. Experimental evaluation shows that these heads are highly robust and achieve state-of-the-art performance in claim-level hallucination detection…
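The idea of scoring a claim from attention-derived features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the helper names (`token_attention_features`, `claim_uncertainty`), the choice of features (attention weights from a token to its immediate predecessors), the mean pooling over claim tokens, and the single linear layer with a sigmoid, which stands in for the Transformer-based head described in the abstract. The weights are untrained placeholders.

```python
import math

def token_attention_features(attn, t, window=2):
    """Collect, for every head, the attention weight from token t to each of
    its `window` preceding tokens (0.0 where no such token exists).

    `attn` is a list of heads; each head is a T x T matrix (list of rows)
    where head[i][j] is the attention from token i to token j.
    This feature choice is an illustrative assumption.
    """
    feats = []
    for head in attn:
        for k in range(1, window + 1):
            feats.append(head[t][t - k] if t - k >= 0 else 0.0)
    return feats

def claim_uncertainty(attn, claim_tokens, weights, bias, window=2):
    """Hypothetical claim-level score in (0, 1): mean-pool the per-token
    features over the claim's tokens, then apply a linear layer + sigmoid.
    A trained head would learn `weights` and `bias` from labeled claims.
    """
    per_token = [token_attention_features(attn, t, window) for t in claim_tokens]
    pooled = [sum(col) / len(per_token) for col in zip(*per_token)]
    logit = bias + sum(w * x for w, x in zip(weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))

# Toy input: one attention head over three generated tokens (causal mask).
attn = [[
    [1.0, 0.0, 0.0],
    [0.4, 0.6, 0.0],
    [0.2, 0.3, 0.5],
]]
# The claim spans tokens 1 and 2; the weights are arbitrary placeholders.
score = claim_uncertainty(attn, claim_tokens=[1, 2], weights=[1.5, -0.5], bias=0.0)
```

A higher score would be read as higher uncertainty for the claim, so claims could be flagged as potential hallucinations by thresholding it; the threshold, like the weights, would have to come from supervised training data.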

Code & Models

Repositories

iinemo/llm-uncertainty-head
pytorch

Taxonomy

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax · Absolute Position Encodings