To Believe or Not to Believe Your LLM

Yasin Abbasi Yadkori; Ilja Kuzborskij; András György; Csaba Szepesvári

arXiv:2406.02543 · cs.LG · July 18, 2024 · 3 cites

Open Access

TL;DR

This paper introduces an information-theoretic approach to quantify and detect epistemic uncertainty in large language models, enabling identification of unreliable responses and hallucinations through iterative prompting.

Contribution

It presents a novel metric for distinguishing epistemic from aleatoric uncertainty in LLMs using only model outputs, improving reliability assessment.

Findings

01. The proposed metric reliably detects when only epistemic uncertainty is large, in which case the model's output is unreliable.

02. Iterative prompting amplifies output probabilities, aiding uncertainty detection.

03. The method outperforms standard strategies in identifying hallucinations.

Abstract

We explore uncertainty quantification in large language models (LLMs), with the goal of identifying when uncertainty in responses given a query is large. We simultaneously consider both epistemic and aleatoric uncertainties, where the former comes from the lack of knowledge about the ground truth (such as about facts or the language), and the latter comes from irreducible randomness (such as multiple possible answers). In particular, we derive an information-theoretic metric that allows one to reliably detect when only epistemic uncertainty is large, in which case the output of the model is unreliable. This condition can be computed based solely on the output of the model obtained simply by some special iterative prompting based on the previous responses. Such quantification, for instance, allows one to detect hallucinations (cases when epistemic uncertainty is high) in both single- and…
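
To make the idea in the abstract concrete, here is a minimal sketch of one way an information-theoretic score could separate the two kinds of uncertainty. It is an illustration under stated assumptions, not the paper's estimator: the truncated abstract does not give the exact metric, so the choice of mutual information between a first response and a second response (sampled after the first is shown back to the model) is an assumption, and the function name `mutual_information` and both probability tables are hypothetical. The intuition is that if several answers are genuinely valid, seeing an earlier answer should not change what the model says next, so the joint distribution factorizes and the score is near zero; if an earlier answer sways the next one, the score is large, which would suggest epistemic uncertainty.

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of a joint probability table.

    joint[i][j] is assumed to be P(first response = i, second response = j),
    where the second response is sampled after the first one has been
    inserted into the prompt. This table is a hypothetical input.
    """
    row = [sum(r) for r in joint]          # marginal of the first response
    col = [sum(c) for c in zip(*joint)]    # marginal of the second response
    mi = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:                      # skip zero cells (0 * log 0 = 0)
                mi += p * math.log(p / (row[i] * col[j]))
    return mi

# Hypothetical case 1: two valid answers, the second response ignores the
# first. The joint factorizes, so the score is zero (aleatoric only).
independent = [[0.25, 0.25],
               [0.25, 0.25]]

# Hypothetical case 2: the second response tends to copy whatever answer
# was shown in the prompt. The responses are dependent, so the score is high.
dependent = [[0.45, 0.05],
             [0.05, 0.45]]

score_independent = mutual_information(independent)
score_dependent = mutual_information(dependent)
```

In practice the joint table would have to be estimated from sampled responses or from the model's output probabilities, and a threshold on the score would have to be chosen empirically; both steps are left out of this sketch.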

Taxonomy

Topics: Comparative and International Law Studies · Artificial Intelligence in Law