Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models

Artem Vazhentsev; Ekaterina Fadeeva; Rui Xing; Gleb Kuzmin; Ivan Lazichny; Alexander Panchenko; Preslav Nakov; Timothy Baldwin; Maxim Panov; Artem Shelmanov

arXiv:2408.10692·cs.CL·April 29, 2026

TL;DR

This paper introduces a method for learning unconditional uncertainty scores for large language model outputs from attention-based features, with the aim of better detecting hallucinations and low-quality output.

Contribution

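It models unconditional uncertainty in LLMs with a regression model over attention maps, current-step token probabilities, and recurrently computed uncertainty scores of previously generated tokens, trained with a two-stage procedure. In the reported experiments it outperforms rival unsupervised and supervised approaches.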

Findings

01. Significant improvement in selective generation accuracy.

02. Effective detection of hallucinations across multiple datasets.

03. Outperforms rival unsupervised and supervised approaches.
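For readers unfamiliar with selective generation, the idea is that the system abstains on the outputs it is least sure about and is judged only on the ones it keeps. The sketch below illustrates that idea with a simple accuracy-at-coverage computation. The function name `selective_accuracy`, the `coverage` parameter, and the toy numbers are illustrative assumptions, and this is not necessarily the metric the paper reports.

```python
import numpy as np

def selective_accuracy(uncertainty, correct, coverage):
    """Accuracy over the `coverage` fraction of outputs with the lowest uncertainty.

    uncertainty: one score per output (higher means less trustworthy).
    correct: 1 if the output was judged correct, else 0.
    coverage: fraction of outputs to keep, in (0, 1].
    All names here are hypothetical and chosen for illustration only.
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Keep at least one output so the mean is always defined.
    n_keep = max(1, int(round(coverage * len(uncertainty))))
    # Indices of the n_keep outputs with the lowest uncertainty.
    keep = np.argsort(uncertainty, kind="stable")[:n_keep]
    return float(correct[keep].mean())

# Toy data: the two uncertain outputs are the two wrong ones.
toy_uncertainty = [0.1, 0.9, 0.2, 0.8]
toy_correct = [1, 0, 1, 0]
full = selective_accuracy(toy_uncertainty, toy_correct, coverage=1.0)
half = selective_accuracy(toy_uncertainty, toy_correct, coverage=0.5)
```

A useful uncertainty score is one where accuracy rises as coverage shrinks, as it does in this toy case when the wrong answers receive the highest scores.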

Abstract

Uncertainty quantification (UQ) has emerged as a promising approach for detecting hallucinations and low-quality output of Large Language Models (LLMs). However, obtaining proper uncertainty scores is complicated by the conditional dependency between the generation steps of an autoregressive LLM because it is hard to model it explicitly. Here, we propose to learn this dependency from attention-based features. In particular, we train a regression model that leverages LLM attention maps, probabilities on the current generation step, and recurrently computed uncertainty scores from previously generated tokens. To incorporate the recurrent features, we also suggest a two-staged training procedure. Our experimental evaluation on ten datasets and three LLMs shows that the proposed method is highly effective for selective generation, achieving substantial improvements over rivaling…
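The recurrent structure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ridge regressor, the single attention-weighted "carried" feature, the binary per-token targets, and the particular reading of the two-stage procedure (first fit with ground-truth targets of earlier tokens as the recurrent input, then refit with the first model's own predictions) are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def fit_linear(X, y, lam=1e-3):
    """Closed-form ridge regression; the last weight is the bias."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)

def step_features(prob, attn_row, prev_scores):
    """Features for one generation step (an assumed, simplified feature set).

    prob: probability of the token generated at this step.
    attn_row: attention weights from this token to earlier tokens.
    prev_scores: uncertainty scores already assigned to those tokens.
    """
    carried = float(attn_row @ prev_scores) if len(prev_scores) else 0.0
    return np.array([prob, carried])

def rollout(w, probs, attn):
    """Score a sequence left to right, feeding earlier scores back in."""
    scores = []
    for t in range(len(probs)):
        x = step_features(probs[t], attn[t, :t], np.array(scores))
        scores.append(float(np.append(x, 1.0) @ w))
    return np.array(scores)

def train_two_stage(seqs):
    """seqs: list of (probs, attn, targets) tuples, one per generated sequence."""
    targets = np.concatenate([y for _, _, y in seqs])
    # Stage 1 (assumed): the recurrent input is the true target of earlier tokens.
    X1 = np.array([step_features(p[t], a[t, :t], y[:t])
                   for p, a, y in seqs for t in range(len(p))])
    w1 = fit_linear(X1, targets)
    # Stage 2 (assumed): refit on recurrent inputs predicted by the stage-1 model,
    # which matches what is available at inference time.
    X2 = []
    for p, a, _ in seqs:
        s = rollout(w1, p, a)
        X2.extend(step_features(p[t], a[t, :t], s[:t]) for t in range(len(p)))
    return fit_linear(np.array(X2), targets)

def make_toy(n_seqs=20, T=6, seed=0):
    """Synthetic data only: low-probability tokens are labelled as errors."""
    rng = np.random.default_rng(seed)
    seqs = []
    for _ in range(n_seqs):
        probs = rng.uniform(0.1, 1.0, size=T)
        attn = np.tril(rng.uniform(size=(T, T)), k=-1)  # attend to earlier tokens only
        attn /= np.maximum(attn.sum(axis=1, keepdims=True), 1e-9)
        seqs.append((probs, attn, (probs < 0.5).astype(float)))
    return seqs

weights = train_two_stage(make_toy())
probs, attn, _ = make_toy(n_seqs=1, seed=1)[0]
token_scores = rollout(weights, probs, attn)
```

The point of the second fit in this sketch is to reduce the mismatch between training, where true labels of earlier tokens are known, and inference, where only the model's own earlier scores are available.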
