Influential Training Data Retrieval for Explaining Verbalized Confidence of LLMs
Yuxi Xia, Loris Schoenegger, Benjamin Roth

TL;DR
This paper introduces TracVC, a method that traces an LLM's verbalized confidence back to its training data. The analysis suggests that models often draw on superficial confidence cues rather than on content relevant to the question, which undermines how far their stated confidence can be trusted.
Contribution
The paper presents TracVC, a method that combines information retrieval with influence estimation to identify the training data behind LLMs' confidence expressions, and introduces content groundness as a new evaluation metric.
Findings
OLMo2-13B often relies on confidence-related training data that is unrelated to the query.
Models tend to mimic superficial linguistic confidence cues.
Current training regimes may not ensure confidence is content-justified.
Abstract
Large language models (LLMs) can increase users' perceived trust by verbalizing confidence in their outputs. However, prior work has shown that LLMs are often overconfident, making their stated confidence unreliable since it does not consistently align with factual accuracy. To better understand the sources of this verbalized confidence, we introduce TracVC (Tracing Verbalized Confidence), a method that builds on information retrieval and influence estimation to trace generated confidence expressions back to the training data. We evaluate TracVC on OLMo and Llama models in a question answering setting, proposing a new metric, content groundness, which measures the extent to which an LLM grounds its confidence in content-related training examples (relevant to the question and answer) versus in generic examples of confidence verbalization. Our analysis reveals…
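The abstract describes content groundness only in words, so the following is a minimal sketch of one way such a metric could be computed, not the paper's actual formula. It assumes that each question already comes with two influence scores, one for content-related training examples and one for generic confidence examples. The names `InfluencePair` and `content_groundness_sketch` and the "fraction of questions where content wins" definition are hypothetical and introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class InfluencePair:
    """Hypothetical container holding two influence scores for one question."""
    content_influence: float     # influence of content-related training examples
    confidence_influence: float  # influence of generic confidence examples


def content_groundness_sketch(pairs: Sequence[InfluencePair]) -> float:
    """Fraction of questions where content-related influence is strictly larger.

    Illustrative assumption only; the paper may define the metric differently.
    """
    if not pairs:
        raise ValueError("need at least one influence pair")
    # Count questions where content-related data outweighs generic confidence data.
    wins = sum(1 for p in pairs if p.content_influence > p.confidence_influence)
    return wins / len(pairs)


# Made-up scores, for demonstration only (not results from the paper).
example_pairs = [
    InfluencePair(0.8, 0.2),
    InfluencePair(0.1, 0.5),
    InfluencePair(0.3, 0.3),  # a tie does not count as content-grounded
    InfluencePair(0.9, 0.4),
]
example_score = content_groundness_sketch(example_pairs)
```

Under this assumed definition, a score near 1 would mean confidence is mostly traced to content-related examples, and a score near 0 would mean it is mostly traced to generic confidence phrasing.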
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
