Toward Better Generalisation in Uncertainty Estimators: Leveraging Data-Agnostic Features

Thuy An Ha; Bao Quoc Vo

arXiv:2507.03998·cs.AI·July 8, 2025

TL;DR

This paper investigates combining data-agnostic features with hidden-state features to improve the out-of-domain generalisation of uncertainty estimators in LLMs, revealing mixed results and the importance of feature selection.

Contribution

The paper introduces a hybrid approach that combines data-agnostic features with hidden-state features for uncertainty estimation, analysing their impact on out-of-domain generalisation in LLMs.
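As a rough illustration of the hybrid idea, the sketch below concatenates hidden-state vectors with a small block of data-agnostic features and fits a logistic-regression probe on the result. Everything concrete here is an assumption made for illustration: the function names, the synthetic data, the choice of two data-agnostic features (for example, mean token log-probability and mean token entropy), and the plain gradient-descent probe are not taken from the paper.

```python
import numpy as np


def combine_features(hidden_states, agnostic_features):
    """Concatenate hidden-state vectors with data-agnostic features.

    Each block is standardised column-wise first so that neither block
    dominates purely because of its scale (an illustrative choice).
    """
    def standardise(x):
        return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)

    return np.concatenate(
        [standardise(hidden_states), standardise(agnostic_features)], axis=1
    )


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic_probe(X, y, lr=0.1, epochs=500):
    """Fit a logistic-regression probe with batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        residual = sigmoid(X @ w + b) - y  # gradient of log-loss w.r.t. logits
        w -= lr * (X.T @ residual) / len(y)
        b -= lr * residual.mean()
    return w, b


# Synthetic stand-ins (hypothetical): real inputs would be LLM hidden states
# and per-response data-agnostic statistics.
rng = np.random.default_rng(0)
n_samples, d_hidden = 200, 16
hidden = rng.normal(size=(n_samples, d_hidden))
agnostic = rng.normal(size=(n_samples, 2))
# Toy correctness labels that depend on both feature blocks.
labels = (agnostic[:, 0] + 0.5 * hidden[:, 0] > 0).astype(float)

X = combine_features(hidden, agnostic)
w, b = train_logistic_probe(X, labels)
probs = sigmoid(X @ w + b)  # estimated probability that each answer is correct
train_accuracy = ((probs > 0.5) == (labels == 1.0)).mean()
```

In an out-of-domain evaluation, the standardisation statistics and probe weights would be fitted on one dataset and reused unchanged on another; the sketch fits and scores on the same toy data only to stay short.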

Findings

01. Data-agnostic features often improve generalisation
02. In some cases, data-agnostic features degrade performance
03. Feature importance analysis shows underweighting of data-agnostic features

Abstract

Large Language Models (LLMs) often generate responses that are factually incorrect yet expressed with high confidence, which can pose serious risks for end users. To address this, it is essential for LLMs not only to produce answers but also to provide accurate estimates of their correctness. Uncertainty quantification methods have been introduced to assess the quality of LLM outputs, with factual accuracy being a key aspect of that quality. Among these methods, those that leverage hidden states to train probes have shown particular promise, as these internal representations encode information relevant to the factuality of responses, making this approach the focus of this paper. However, the probe trained on the hidden states of one dataset often struggles to generalise to another dataset of a different task or domain. To address this limitation, we explore combining data-agnostic…
