Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting
Guande He, Peng Cui, Jianfei Chen, Wenbo Hu, Jun Zhu

TL;DR
This paper evaluates how alignment affects the uncertainty calibration of language models in multiple-choice tasks, revealing overconfidence issues and proposing a simple calibration method to improve reliability.
Contribution
The paper systematically studies the impact of alignment on LM calibration, identifies two distinct uncertainties (one behind the answer decision, one behind the format preference), and introduces an effective post-hoc calibration approach.
Findings
Aligned LMs are more overconfident in their output answers than the corresponding pre-trained LMs.
Under the multiple-choice setting, LMs carry two distinct uncertainties: one governs the answer decision, the other the format preference.
A simple calibration method effectively improves LM uncertainty calibration.
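To make "post-hoc calibration" concrete, here is a minimal sketch of temperature scaling, a generic and widely used post-hoc technique. It is swapped in purely for illustration: this summary does not say which method the paper proposes, so this should not be read as the authors' approach. The function names, the grid of candidate temperatures, and the toy logits are all assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over option logits after dividing by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the correct options."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature from a grid that minimizes held-out NLL.

    The grid (0.25 to 10.0 in steps of 0.25) is an arbitrary
    illustrative choice, not a recommended setting.
    """
    grid = grid or [0.25 * k for k in range(1, 41)]
    return min(grid, key=lambda t: nll(logit_rows, labels, t))

# Hypothetical held-out set: the model strongly prefers option 0 on
# every question but is right on only three of four.
held_out_logits = [[5.0, 0.0]] * 4
held_out_labels = [0, 0, 0, 1]

t = fit_temperature(held_out_logits, held_out_labels)
print("fitted temperature:", t)
print("confidence before:", softmax([5.0, 0.0])[0])
print("confidence after:", softmax([5.0, 0.0], t)[0])
```

A fitted temperature above 1 flattens the option distribution, which lowers confidence without changing which option is ranked first, so accuracy is unaffected.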
Abstract
Despite the significant progress made in practical applications of aligned language models (LMs), they tend to be overconfident in output answers compared to the corresponding pre-trained LMs. In this work, we systematically evaluate the impact of the alignment process on logit-based uncertainty calibration of LMs under the multiple-choice setting. We first conduct a thoughtful empirical study on how aligned LMs differ in calibration from their pre-trained counterparts. Experimental results reveal that there are two distinct uncertainties in LMs under the multiple-choice setting, which are responsible for the answer decision and the format preference of the LMs, respectively. Then, we investigate the role of these two uncertainties on aligned LM's calibration through fine-tuning in simple synthetic alignment schemes and conclude that one reason for aligned LMs' overconfidence is the…
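As a rough illustration of what logit-based calibration under the multiple-choice setting can mean in practice, the sketch below turns the logits of the answer-option tokens into a confidence score and computes the expected calibration error (ECE), a common calibration metric. The abstract as shown here does not specify the metric or the evaluation protocol, so the choice of ECE, the 10 equal-width bins, the function names, and the toy logits are all assumptions.

```python
import math

def option_probs(option_logits):
    """Softmax restricted to the logits of the answer options (e.g. A/B/C/D)."""
    m = max(option_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in option_logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: size-weighted mean of |accuracy - mean confidence| over bins.

    Uses equal-width bins [k/n_bins, (k+1)/n_bins); a confidence of
    exactly 1.0 is placed in the last bin.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece

# Hypothetical option logits for three questions and their gold answers.
logits = [[4.0, 0.5, 0.1, -1.0], [3.5, 3.0, 0.0, -0.5], [0.2, 0.1, 5.0, 0.3]]
gold = [0, 1, 2]

confidences, correct = [], []
for row, answer in zip(logits, gold):
    probs = option_probs(row)
    predicted = probs.index(max(probs))
    confidences.append(probs[predicted])  # confidence = probability of the top option
    correct.append(predicted == answer)

print("ECE:", expected_calibration_error(confidences, correct))
```

An overconfident model shows mean confidence above accuracy in its high-confidence bins, which raises ECE; a perfectly calibrated model scores 0.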
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science
