Mitigating LLM Hallucinations via Conformal Abstention
Yasin Abbasi Yadkori, Ilja Kuzborskij, David Stutz, András György, Adam Fisch, Arnaud Doucet, Iuliya Beloshapka, Wei-Hung Weng, Yao-Yuan Yang, Csaba Szepesvári, Ali Taylan Cemgil, Nenad Tomasev

TL;DR
This paper introduces a conformal abstention method for large language models that reduces hallucinations by deciding when the model should abstain. It combines the model's self-evaluation of the similarity between its sampled responses with conformal prediction, which provides theoretical guarantees on the hallucination rate.
Contribution
It proposes a novel abstention procedure leveraging self-evaluation and conformal prediction to control hallucination rates with theoretical guarantees.
Findings
Effectively bounds hallucination rates on question answering datasets
Maintains lower abstention rates on long-response datasets
Achieves comparable accuracy on short-answer datasets
Abstract
We develop a principled procedure for determining when a large language model (LLM) should abstain from responding (e.g., by saying "I don't know") in a general domain, instead of resorting to possibly "hallucinating" a non-sensical or incorrect answer. Building on earlier approaches that use self-consistency as a more reliable measure of model confidence, we propose using the LLM itself to self-evaluate the similarity between each of its sampled responses for a given query. We then further leverage conformal prediction techniques to develop an abstention procedure that benefits from rigorous theoretical guarantees on the hallucination rate (error rate). Experimentally, our resulting conformal abstention method reliably bounds the hallucination rate on various closed-book, open-domain generative question answering datasets, while also maintaining a significantly less conservative…
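The calibration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic conformal-risk-control style threshold selection, written under the assumption that each calibration query comes with a confidence score and a correctness label. The names `agreement_score`, `calibrate_threshold`, and `should_abstain` are hypothetical, and the `similar` callable stands in for the LLM's self-evaluation of whether two responses match.

```python
from typing import Callable, Sequence


def agreement_score(responses: Sequence[str],
                    similar: Callable[[str, str], bool]) -> float:
    """Fraction of the other sampled responses judged similar to the first.

    `similar` is a hypothetical stand-in for the LLM's self-evaluation.
    """
    if len(responses) < 2:
        raise ValueError("need at least two sampled responses")
    first, rest = responses[0], responses[1:]
    return sum(similar(first, r) for r in rest) / len(rest)


def calibrate_threshold(scores: Sequence[float],
                        wrong: Sequence[bool],
                        alpha: float) -> float:
    """Smallest threshold whose adjusted calibration risk is at most alpha.

    The model answers when score >= threshold and abstains otherwise.
    A calibration example counts as an error when the model answers and
    the answer is wrong. The (errors + 1) / (n + 1) adjustment is the
    usual finite-sample correction for a loss bounded by 1.
    Returns infinity (always abstain) if no threshold qualifies.
    """
    n = len(scores)
    for lam in sorted(set(scores)):
        errors = sum(1 for s, w in zip(scores, wrong) if s >= lam and w)
        if (errors + 1) / (n + 1) <= alpha:
            return lam
    return float("inf")


def should_abstain(score: float, threshold: float) -> bool:
    """Abstain (e.g. answer "I don't know") when confidence is too low."""
    return score < threshold


if __name__ == "__main__":
    # Toy calibration data, invented purely for illustration.
    cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    cal_wrong = [True, True, True, False, True, False, False, False, False]
    lam = calibrate_threshold(cal_scores, cal_wrong, alpha=0.25)

    # Exact string match replaces the LLM similarity judgment here.
    score = agreement_score(["Paris", "Paris", "Lyon"], lambda a, b: a == b)
    print(lam, score, should_abstain(score, lam))
```

Because the number of answered-and-wrong examples can only shrink as the threshold grows, scanning the candidate thresholds in increasing order and stopping at the first that meets the bound yields the least conservative choice under this sketch's assumptions.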
Taxonomy
Topics: Benford’s Law and Fraud Detection · Quantum Mechanics and Applications · Hallucinations in medical conditions
