Enhancing Language Model Factuality via Activation-Based Confidence Calibration and Guided Decoding
Xin Liu, Farima Fatahi Bayat, Lu Wang

TL;DR
This paper introduces ActCab, an activation-based calibration method, and CoDec, a confidence-guided decoding strategy built on top of it, to improve language model calibration and factuality so that responses are more reliable and truthful.
Contribution
The paper presents ActCab, a calibration method that trains a linear layer on the LM's last-layer activations, and CoDec, a confidence-guided decoding strategy built on ActCab. Together they improve both calibration performance and factuality.
Findings
ActCab reduces expected calibration error by up to 39%.
CoDec improves factuality on challenging QA datasets.
Calibration signals help generate more truthful answers.
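For readers unfamiliar with the metric behind the 39% figure, the sketch below computes expected calibration error (ECE) using the common equal-width binning definition: predictions are grouped by confidence, and the gap between each bin's accuracy and its mean confidence is averaged, weighted by bin size. The function name, the 10-bin default, and the toy inputs are assumptions made for illustration; the paper's exact evaluation setup is not given in this summary.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE.

    confidences: floats in [0, 1], one per answer.
    correct: 0/1 (or bool) flags marking whether each answer was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Assign every prediction to a confidence bin; 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1 if ok else 0))

    # Weighted average of |accuracy - mean confidence| over non-empty bins.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Toy example (made-up numbers): four answers, all stated at 0.9 confidence,
# of which three are correct.
toy_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
print(toy_ece)
```

A lower value means stated confidence tracks actual correctness more closely; a perfectly calibrated model scores 0.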
Abstract
Calibrating language models (LMs) aligns their generation confidence with the actual likelihood of answer correctness, which can inform users about LMs' reliability and mitigate hallucinated content. However, prior calibration methods, such as self-consistency-based and logit-based approaches, are either limited in inference-time efficiency or fall short of providing informative signals. Moreover, simply filtering out low-confidence responses reduces the LM's helpfulness when the answers are correct. Therefore, effectively using calibration techniques to enhance an LM's factuality remains an unsolved challenge. In this paper, we first propose an activation-based calibration method, ActCab, which trains a linear layer on top of the LM's last-layer activations that can better capture the representations of knowledge. Built on top of ActCab, we further propose CoDec, a confidence-guided…
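The core idea described for ActCab, a linear layer over last-layer activations that outputs a confidence, can be sketched as a small linear probe. The version below is only an illustration under stated assumptions: it uses a sigmoid output trained with binary cross-entropy on 0/1 correctness labels, plain gradient descent, and hand-made two-dimensional "activations". The training objective, the labels, and all names here are hypothetical and are not taken from the paper; in practice the inputs would be hidden states extracted from the LM.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_linear_probe(activations, labels, lr=0.5, epochs=200):
    """Fit weights and a bias so sigmoid(w . x + b) predicts correctness.

    activations: list of equal-length float vectors (stand-ins for LM
                 last-layer activations; assumed, not real model outputs).
    labels: 1 if the corresponding answer was correct, else 0.
    """
    dim = len(activations[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(activations)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(activations, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of binary cross-entropy w.r.t. the logit
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def confidence(weights, bias, activation):
    """Probe output in (0, 1), read as the confidence that the answer is correct."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, activation)) + bias)


# Made-up toy data: the first three vectors belong to correct answers.
toy_activations = [
    [1.0, 0.8], [0.9, 1.2], [1.1, 0.7],
    [-1.0, -0.9], [-0.8, -1.1], [-1.2, -0.7],
]
toy_labels = [1, 1, 1, 0, 0, 0]

probe_w, probe_b = train_linear_probe(toy_activations, toy_labels)
print(confidence(probe_w, probe_b, [1.0, 1.0]))
print(confidence(probe_w, probe_b, [-1.0, -1.0]))
```

Because the probe is a single linear layer over activations the model has already computed, scoring an answer costs one dot product rather than repeated sampling, which fits the abstract's concern with inference-time efficiency.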
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Linear Layer
