Enhancing Trust in Large Language Models via Uncertainty-Calibrated Fine-Tuning
Ranganath Krishnan, Piyush Khanna, Omesh Tickoo

TL;DR
This paper introduces an uncertainty-aware fine-tuning method for large language models that improves the calibration of uncertainty estimates, enhances trustworthiness, and aids in hallucination detection in natural language generation.
Contribution
The paper proposes a novel uncertainty-aware causal language modeling loss function that improves uncertainty calibration without sacrificing accuracy.
Findings
Better calibrated uncertainty estimates than standard fine-tuning.
Significantly improved hallucination detection capabilities.
Enhanced out-of-domain prompt identification.
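To make "better calibrated" concrete, here is a minimal sketch of expected calibration error (ECE), a widely used calibration metric. The summary above does not say which metric the paper reports, so treating ECE as the measure is an assumption. The function name, the equal-width binning scheme, and the toy inputs are all illustrative and not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: floats in [0, 1], the model's confidence in each answer.
    correct:     booleans, whether each answer was actually right.
    NOTE: illustrative helper only; not the paper's evaluation code.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of the data.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Toy example: the model claims 90% confidence but is right 75% of the time.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                           [True, True, True, False])
```

A lower value means stated confidence tracks observed accuracy more closely. A perfectly calibrated model scores 0.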
Abstract
Large language models (LLMs) have revolutionized the field of natural language processing with their impressive reasoning and question-answering capabilities. However, these models are sometimes prone to generating credible-sounding but incorrect information, a phenomenon known as LLM hallucinations. Reliable uncertainty estimation in LLMs is essential for fostering trust in their generated responses and serves as a critical tool for the detection and prevention of erroneous or hallucinated outputs. To achieve reliable and well-calibrated uncertainty quantification in open-ended and free-form natural language generation, we propose an uncertainty-aware fine-tuning approach for LLMs. This approach enhances the model's ability to provide reliable uncertainty estimates without compromising accuracy, thereby guiding the model to produce more trustworthy responses. We introduce a novel…
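As a rough illustration of how a token-level uncertainty score might be used to flag possibly hallucinated outputs, the sketch below computes the mean predictive entropy over a generated sequence and compares it to a threshold. The visible part of the abstract does not specify the uncertainty measure or the detection rule, so the choice of entropy, the helper names, and the threshold value are all assumptions made for illustration.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_token_entropy(token_distributions):
    """Average per-token entropy over a generated sequence.

    token_distributions: one probability list per generated token.
    NOTE: hypothetical helper; the paper's actual score may differ.
    """
    if not token_distributions:
        raise ValueError("need at least one token distribution")
    total = sum(predictive_entropy(d) for d in token_distributions)
    return total / len(token_distributions)


def flag_uncertain(token_distributions, threshold=0.5):
    """Flag a response whose mean entropy exceeds an assumed, untuned threshold."""
    return mean_token_entropy(token_distributions) > threshold


# Toy distributions over a 2-token vocabulary.
confident_response = [[0.99, 0.01], [0.98, 0.02]]
uncertain_response = [[0.5, 0.5], [0.6, 0.4]]
```

In practice the threshold would be chosen on held-out data. A score like this is only useful for detection if the underlying probabilities are well calibrated, which is the property the fine-tuning approach targets.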
