Uncertainty-Based Abstention in LLMs Improves Safety and Reduces Hallucinations
Christian Tomani, Kamalika Chaudhuri, Ivan Evtimov, Daniel Cremers, and Mark Ibrahim

TL;DR
This paper shows that uncertainty-based abstention improves the correctness and safety of large language models in question answering, and reduces hallucinations on unanswerable questions, by having the model decline to respond when it is uncertain.
Contribution
It evaluates statistical uncertainty measures alongside a verbalized measure, In-Dialogue Uncertainty (InDU), as signals for LLM abstention, improving reliability without substantial computational cost.
Findings
Correctness improved by 2% to 8%
Hallucinations reduced by 50%
Safety increased up to 99%
Abstract
A major barrier towards the practical deployment of large language models (LLMs) is their lack of reliability. Three situations where this is particularly apparent are correctness, hallucinations when given unanswerable questions, and safety. In all three cases, models should ideally abstain from responding, much like humans, whose ability to understand uncertainty makes us refrain from answering questions we don't know. Inspired by analogous approaches in classification, this study explores the feasibility and efficacy of abstaining while uncertain in the context of LLMs within the domain of question-answering. We investigate two kinds of uncertainties, statistical uncertainty metrics and a distinct verbalized measure, termed as In-Dialogue Uncertainty (InDU). Using these uncertainty measures combined with models with and without Reinforcement Learning with Human Feedback (RLHF), we…
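The abstention idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mean token entropy below stands in for one possible statistical uncertainty metric, the hedge-phrase counter is only an assumed approximation of a verbalized measure in the spirit of InDU, and the names `HEDGE_PHRASES`, `mean_token_entropy`, `hedge_count`, and `answer_or_abstain`, along with the threshold value, are hypothetical.

```python
import math
from typing import Optional, Sequence

# Hypothetical hedge phrases, chosen for illustration only (not from the paper).
HEDGE_PHRASES = (
    "i'm not sure",
    "i am not sure",
    "i don't know",
    "possibly",
    "perhaps",
    "might",
    "uncertain",
)


def mean_token_entropy(token_distributions: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) over per-token probability distributions."""
    if not token_distributions:
        raise ValueError("need at least one token distribution")
    total = 0.0
    for dist in token_distributions:
        # Zero-probability entries contribute nothing to the entropy.
        total += -sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(token_distributions)


def hedge_count(response: str) -> int:
    """Count occurrences of the assumed hedge phrases in a response."""
    text = response.lower()
    return sum(text.count(phrase) for phrase in HEDGE_PHRASES)


def answer_or_abstain(
    response: str, uncertainty: float, threshold: float
) -> Optional[str]:
    """Return the response, or None (abstain) when uncertainty exceeds the threshold."""
    return None if uncertainty > threshold else response


if __name__ == "__main__":
    confident = [[0.97, 0.01, 0.01, 0.01]] * 2  # peaked next-token distributions
    unsure = [[0.25, 0.25, 0.25, 0.25]] * 2     # uniform next-token distributions
    threshold = 0.5                             # assumed value; would be tuned on held-out data
    print(answer_or_abstain("Paris", mean_token_entropy(confident), threshold))
    print(answer_or_abstain("Paris", mean_token_entropy(unsure), threshold))
```

In this sketch either score can be passed as `uncertainty`; the threshold sets the trade-off between how many questions are answered and how reliable the answers are.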
