MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs

Gabrielle Kaili-May Liu; Gal Yona; Avi Caciularu; Idan Szpektor; Tim G. J. Rudner; Arman Cohan

arXiv:2505.24858·cs.CL·October 3, 2025

TL;DR

This paper investigates how well large language models communicate their uncertainty truthfully, finds they often fail, and introduces MetaFaith, a new prompt-based method that significantly improves their faithful uncertainty expression.

Contribution

The paper presents the first systematic study of faithful confidence calibration in LLMs and proposes MetaFaith, a novel prompt-based approach inspired by human metacognition, to improve it.

Findings

01. LLMs largely fail at faithful uncertainty expression.

02. Standard prompting approaches provide only marginal improvements.

03. MetaFaith improves faithful calibration by up to 61% and achieves an 83% win rate over original generations as judged by humans.

Abstract

A critical component in the trustworthiness of LLMs is reliable uncertainty communication, yet LLMs often use assertive language when conveying false claims, leading to over-reliance and eroded trust. We present the first systematic study of faithful confidence calibration of LLMs, benchmarking models' ability to use linguistic expressions of uncertainty that faithfully reflect their intrinsic uncertainty, across a comprehensive array of models, datasets, and prompting strategies. Our results demonstrate that LLMs largely fail at this task, and that existing interventions are insufficient: standard prompt approaches provide only marginal gains, and existing, factuality-based calibration techniques can even harm faithful calibration. To address this critical gap, we introduce MetaFaith, a novel prompt-based calibration approach inspired by human metacognition. We…
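To make the idea of faithful calibration concrete, here is a minimal sketch of one way to quantify the mismatch between how decisively a model words its claims and how confident it actually is. It is an illustration only: the function name `faithfulness_gap`, the [0, 1] scoring scale, and the example scores are all assumptions, and the abstract shown here does not specify the paper's own metric.

```python
from statistics import mean


def faithfulness_gap(decisiveness, confidence):
    """Mean absolute gap between expressed decisiveness and intrinsic confidence.

    Both inputs are hypothetical per-claim scores in [0, 1]:
    decisiveness = how assertively the claim is worded,
    confidence   = the model's intrinsic confidence in the claim.
    A gap of 0.0 means the wording matches the confidence exactly.
    """
    if not decisiveness or len(decisiveness) != len(confidence):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(d - c) for d, c in zip(decisiveness, confidence))


# Assumed example scores: two claims stated with full assertiveness
# while the model's intrinsic confidence is only 0.5 and 0.25.
overclaiming = faithfulness_gap([1.0, 1.0], [0.5, 0.25])

# The same claims, hedged so that the wording tracks the confidence.
hedged = faithfulness_gap([0.5, 0.25], [0.5, 0.25])

print(overclaiming, hedged)
```

Under these assumed scores, the assertive phrasing yields a larger gap than the hedged phrasing, which is the kind of mismatch the abstract describes as unfaithful uncertainty expression.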

Code & Models

Repositories

yale-nlp/metafaith · Official

Videos

MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs · underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling