Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models

Artem Vazhentsev; Lyudmila Rvanova; Ivan Lazichny; Alexander Panchenko; Maxim Panov; Timothy Baldwin; Artem Shelmanov

arXiv:2502.14427 · cs.CL · April 29, 2026

TL;DR

This paper introduces a supervised, density-based uncertainty quantification (UQ) method for large language models. It scores each generated token by its Mahalanobis distance in embedding space, improving the accuracy and efficiency of truthfulness assessment across tasks and datasets.

Contribution

The paper adapts Mahalanobis Distance (MD) to text generation, yielding a supervised UQ method that outperforms existing methods in multiple evaluation scenarios.
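
For reference, the standard Mahalanobis distance that the method builds on can be written as follows. This is the textbook definition, not a formula taken from the paper. The symbols $h$ (a token embedding), $\mu$ (a centroid of reference embeddings), and $\Sigma$ (their covariance matrix) are notation assumed here, and this summary does not say how the paper estimates them.

```latex
\mathrm{MD}(h) = \sqrt{(h - \mu)^{\top} \, \Sigma^{-1} \, (h - \mu)}
```

A larger value means the embedding lies farther from the reference distribution, which density-based UQ treats as a sign of higher uncertainty.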

Findings

01. Substantial improvement over existing UQ methods in experiments on eleven datasets.

02. Effective in both sequence-level and claim-level tasks.

03. Generalizes well to out-of-domain data.

Abstract

Uncertainty quantification (UQ) is a prominent approach for eliciting truthful answers from large language models (LLMs). To date, information-based and consistency-based UQ have been the dominant UQ methods for text generation via LLMs. Density-based methods, despite being very effective for UQ in text classification with encoder-based models, have not been very successful with generative LLMs. In this work, we adapt Mahalanobis Distance (MD), a well-established UQ technique in classification tasks, for text generation and introduce a new supervised UQ method. Our method extracts token embeddings from multiple layers of LLMs, computes MD scores for each token, and uses linear regression trained on these features to provide robust uncertainty scores. Through extensive experiments on eleven datasets, we demonstrate that our approach substantially improves over existing UQ methods,…
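
The pipeline described in the abstract (per-layer token embeddings, per-token MD scores, linear regression on the resulting features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It makes several assumptions that the abstract does not state: random arrays stand in for real LLM hidden states, token-level squared MD scores are averaged into one feature per layer, the regression targets are hypothetical labels (0 for reliable, 1 for unreliable), and all function names are invented for illustration.

```python
import numpy as np

def fit_gaussian(token_embs, eps=1e-3):
    """Fit a mean and a regularized inverse covariance to (n_tokens, dim) embeddings."""
    mu = token_embs.mean(axis=0)
    cov = np.cov(token_embs, rowvar=False) + eps * np.eye(token_embs.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis_sq(token_embs, mu, precision):
    """Squared Mahalanobis distance of every token embedding from mu."""
    diff = token_embs - mu
    return np.einsum("nd,de,ne->n", diff, precision, diff)

def sequence_features(layer_embs, gaussians):
    """One feature per layer: mean token-level MD (averaging is an assumption here)."""
    return np.array([
        mahalanobis_sq(embs, mu, prec).mean()
        for embs, (mu, prec) in zip(layer_embs, gaussians)
    ])

def fit_linear(features, targets):
    """Ordinary least squares with an intercept column."""
    X = np.hstack([features, np.ones((len(features), 1))])
    weights, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return weights

def predict(features, weights):
    X = np.hstack([features, np.ones((len(features), 1))])
    return X @ weights

# Synthetic demo: random vectors stand in for hidden states from two layers.
rng = np.random.default_rng(0)
n_layers, dim = 2, 4

reference = [rng.normal(size=(500, dim)) for _ in range(n_layers)]
gaussians = [fit_gaussian(r) for r in reference]

def fake_sequence(shift):
    """Six synthetic token embeddings per layer, offset from the reference mean."""
    return [rng.normal(loc=shift, size=(6, dim)) for _ in range(n_layers)]

# Hypothetical labels: 0.0 = reliable generation, 1.0 = unreliable generation.
train = [(fake_sequence(0.0), 0.0) for _ in range(20)]
train += [(fake_sequence(3.0), 1.0) for _ in range(20)]
X = np.array([sequence_features(seq, gaussians) for seq, _ in train])
y = np.array([label for _, label in train])
weights = fit_linear(X, y)

in_dist = predict(sequence_features(fake_sequence(0.0), gaussians)[None, :], weights)[0]
shifted = predict(sequence_features(fake_sequence(3.0), gaussians)[None, :], weights)[0]
print(f"in-distribution score: {in_dist:.3f}, shifted score: {shifted:.3f}")
```

In this toy setup, sequences whose embeddings lie far from the reference distribution receive higher uncertainty scores. With a real model, the embeddings would come from the LLM's hidden states and the targets from a correctness measure on labeled generations.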
