On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
Ziyu Wang, Chris Holmes

TL;DR
This paper introduces a Bayesian decision theory approach to quantify and calibrate subjective uncertainty in large language models' free-form responses, with applications to question answering and translation.
Contribution
It proposes a novel method for subjective and epistemic uncertainty quantification in black-box language models based on Bayesian principles.
Findings
Epistemic uncertainty correlates with model errors.
Proposed calibration improves response reliability.
Uncertainty-based data acquisition enhances in-context learning.
Abstract
Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging. This is due to the need to identify task-specific uncertainties (e.g., about the semantics), which appear difficult to define in general cases. This work addresses these challenges from the perspective of Bayesian decision theory, starting from the assumption that our utility is characterized by a similarity measure that compares a generated response with a hypothetical true response. We discuss how this assumption enables principled quantification of the model's subjective uncertainty and its calibration. We further derive a measure for epistemic uncertainty, based on a missing data perspective and its characterization as an excess risk. The proposed methods can be applied to black-box language models. We illustrate the methods on…
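The decision-theoretic setup in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that responses sampled from a black-box model stand in for the hypothetical true response, that the similarity measure takes values in [0, 1], and that subjective uncertainty is reported as one minus the best achievable expected similarity. The names jaccard_similarity, expected_utility, and bayes_response_and_uncertainty are hypothetical, and the token-overlap similarity is a toy stand-in for a task-specific measure.

```python
from typing import Callable, List, Tuple


def jaccard_similarity(a: str, b: str) -> float:
    """Toy similarity in [0, 1]: overlap of lowercase token sets (an assumption)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def expected_utility(
    candidate: str,
    samples: List[str],
    similarity: Callable[[str, str], float],
) -> float:
    """Monte Carlo estimate of the expected similarity between a candidate
    and the model's own samples, which play the role of the true response."""
    return sum(similarity(candidate, s) for s in samples) / len(samples)


def bayes_response_and_uncertainty(
    samples: List[str],
    similarity: Callable[[str, str], float] = jaccard_similarity,
) -> Tuple[str, float]:
    """Pick the sampled response with the highest estimated expected utility
    and report 1 - that utility as a subjective uncertainty score."""
    if not samples:
        raise ValueError("need at least one sampled response")
    utilities = [expected_utility(c, samples, similarity) for c in samples]
    best = max(range(len(samples)), key=utilities.__getitem__)
    return samples[best], 1.0 - utilities[best]


# Hypothetical samples drawn from a black-box model for a single prompt.
samples = ["Paris", "Paris", "Paris", "Lyon"]
response, uncertainty = bayes_response_and_uncertainty(samples)
print(response, uncertainty)
```

In this sketch each candidate is also compared with itself, which slightly inflates the estimated utility for small sample sizes; a leave-one-out average would avoid that. Swapping in a semantic similarity measure changes only the similarity argument.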
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Byte Pair Encoding · Adam · Attention Dropout · Linear Layer · Multi-Head Attention · Dropout · Dense Connections · Cosine Annealing
