Mitigating LLM Hallucinations via Conformal Abstention

Yasin Abbasi Yadkori; Ilja Kuzborskij; David Stutz; András György; Adam Fisch; Arnaud Doucet; Iuliya Beloshapka; Wei-Hung Weng; Yao-Yuan Yang; Csaba Szepesvári; Ali Taylan Cemgil; Nenad Tomasev

arXiv:2405.01563·cs.LG·May 6, 2024·2 cites

Open Access

TL;DR

This paper introduces a conformal abstention method for large language models that reliably reduces hallucinations by determining when the model should abstain, using self-evaluation and conformal prediction for theoretical guarantees.

Contribution

It proposes a novel abstention procedure leveraging self-evaluation and conformal prediction to control hallucination rates with theoretical guarantees.
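The self-evaluation step can be illustrated with a minimal sketch: sample several responses to one query, ask a judge whether each pair of responses agrees, and use the fraction of agreeing pairs as a confidence score. The names `agreement_score`, `similar`, and `normalized_match` are hypothetical and not from the paper; in the paper's setting the judge would be the LLM itself, whereas here a simple string comparison stands in for it, and the paper's actual scoring function may differ.

```python
from itertools import combinations
from typing import Callable, Sequence


def agreement_score(
    responses: Sequence[str],
    similar: Callable[[str, str], bool],
) -> float:
    """Fraction of response pairs judged similar (higher = more self-consistent).

    `similar` is a hypothetical stand-in for the LLM's own similarity judgment.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:
        # With fewer than two samples there is no evidence of agreement.
        return 0.0
    return sum(similar(a, b) for a, b in pairs) / len(pairs)


def normalized_match(a: str, b: str) -> bool:
    """Toy judge (assumption): case- and whitespace-insensitive equality."""
    return a.strip().lower() == b.strip().lower()


# Three sampled answers to the same query: two agree, one does not.
samples = ["Paris", "paris ", "Lyon"]
score = agreement_score(samples, normalized_match)  # 1 of 3 pairs agree
```

A low score suggests the sampled responses disagree with one another, which is the signal an abstention rule can threshold on.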

Findings

1. Effectively bounds hallucination rates on question answering datasets
2. Maintains lower abstention rates on long response datasets
3. Achieves comparable accuracy on short answer datasets

Abstract

We develop a principled procedure for determining when a large language model (LLM) should abstain from responding (e.g., by saying "I don't know") in a general domain, instead of resorting to possibly "hallucinating" a nonsensical or incorrect answer. Building on earlier approaches that use self-consistency as a more reliable measure of model confidence, we propose using the LLM itself to self-evaluate the similarity between each of its sampled responses for a given query. We then further leverage conformal prediction techniques to develop an abstention procedure that benefits from rigorous theoretical guarantees on the hallucination rate (error rate). Experimentally, our resulting conformal abstention method reliably bounds the hallucination rate on various closed-book, open-domain generative question answering datasets, while also maintaining a significantly less conservative…
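To make the calibration idea concrete, here is a minimal sketch of how an abstention threshold could be chosen on held-out data so that the rate of answered-and-wrong responses stays below a target level alpha. It uses a generic conformal risk control style rule (pick the smallest threshold whose inflated empirical risk is at most alpha) and is an illustration under stated assumptions, not necessarily the authors' exact procedure. The names `calibrate_threshold` and `should_abstain` are hypothetical, the confidence scores are assumed to come from some self-consistency measure, and the guarantee relies on calibration and test examples being exchangeable.

```python
import math
from typing import Sequence


def calibrate_threshold(
    scores: Sequence[float],
    hallucinated: Sequence[bool],
    alpha: float,
) -> float:
    """Smallest threshold lam such that (errors(lam) + 1) / (n + 1) <= alpha.

    errors(lam) counts calibration examples that would be answered
    (score >= lam) and are hallucinations. The loss is bounded by 1,
    which gives the "+ 1" in the numerator.
    """
    n = len(scores)
    # The error count only changes at observed scores, so these candidates suffice.
    candidates = [-math.inf] + sorted(set(scores)) + [math.inf]
    for lam in candidates:  # ascending: the first feasible value is the least conservative
        errors = sum(1 for s, h in zip(scores, hallucinated) if h and s >= lam)
        if (errors + 1) / (n + 1) <= alpha:
            return lam
    # Too little calibration data for this alpha: always abstain.
    return math.inf


def should_abstain(score: float, lam: float) -> bool:
    """Abstain when the confidence score falls below the calibrated threshold."""
    return score < lam


# Toy calibration set (made-up numbers, for illustration only).
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
cal_halluc = [True, True, True, False, True, False, False, False, False]
lam_hat = calibrate_threshold(cal_scores, cal_halluc, alpha=0.25)  # 0.4 here
```

On this toy data the rule returns 0.4: answering everything with a score of at least 0.4 leaves one hallucination among nine calibration examples, and (1 + 1) / (9 + 1) = 0.2 is within the 0.25 target, while any lower threshold admits at least two. When the calibration set is too small for the requested alpha, the sketch falls back to always abstaining.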

Taxonomy

Topics: Benford’s Law and Fraud Detection · Quantum Mechanics and Applications · Hallucinations in medical conditions