Shapley Uncertainty in Natural Language Generation
Meilin Zhu, Gaojie Jin, Xiaowei Huang, Lijun Zhang

TL;DR
This paper introduces a Shapley-based uncertainty metric for large language models that captures the continuous nature of semantic relationships, rather than thresholded equivalence, and improves prediction of model performance in question-answering tasks.
Contribution
The paper develops a Shapley uncertainty framework that extends semantic entropy (Kuhn et al., 2023), proves that it satisfies three fundamental properties of valid uncertainty metrics, and shows that it outperforms similar baselines in predicting LLM performance.
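For readers unfamiliar with the term, the sketch below computes classical game-theoretic Shapley values, the concept the metric's name refers to. It is not the paper's uncertainty metric, whose definition is not given on this page. The function `shapley_values`, the toy value function `distinct_meanings`, and the three example answers are all hypothetical, chosen only to illustrate how credit is split among answers that overlap in meaning.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small cooperative game.

    players: list of hashable player ids.
    value: function mapping a set of players to a number (hypothetical here).
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of i when joining coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


# Toy assumption: answers "a" and "b" share a meaning, "c" differs.
meaning = {"a": "paris", "b": "paris", "c": "lyon"}


def distinct_meanings(coalition):
    # Value of a coalition = number of distinct meanings it covers.
    return len({meaning[p] for p in coalition})


phi = shapley_values(["a", "b", "c"], distinct_meanings)
```

In this toy game the two answers with the same meaning split the credit for that meaning, while the answer with a unique meaning keeps its full credit, and the values sum to the value of the full set.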
Findings
Shapley uncertainty more accurately predicts LLM performance.
The framework captures continuous semantic relationships.
It outperforms existing baseline uncertainty measures.
Abstract
In question-answering tasks, determining when to trust a model's outputs is crucial to the alignment of large language models (LLMs). Kuhn et al. (2023) introduce semantic entropy as a measure of uncertainty that incorporates linguistic invariances among outputs sharing the same meaning. It relies primarily on setting a threshold to decide whether two outputs are semantically equivalent. We propose a more nuanced framework that extends beyond such thresholding by developing a Shapley-based uncertainty metric that captures the continuous nature of semantic relationships. We establish three fundamental properties that characterize valid uncertainty metrics and prove that our Shapley uncertainty satisfies them. Through extensive experiments, we demonstrate that Shapley uncertainty predicts LLM performance more accurately than similar baseline measures on question-answering and other datasets.
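The thresholding limitation described above can be made concrete with a minimal sketch of a thresholded semantic entropy. It assumes a precomputed matrix of pairwise similarity scores and a simple greedy clustering rule. The function name `semantic_entropy`, the scores, and the probabilities are hypothetical and are not taken from Kuhn et al. (2023) or from this paper.

```python
import math


def semantic_entropy(probs, sim, threshold):
    """Entropy over clusters formed by thresholding pairwise similarity.

    probs: probabilities of the sampled answers (assumed to sum to 1).
    sim: symmetric matrix of similarity scores in [0, 1] (hypothetical).
    threshold: scores at or above this count as "same meaning".
    """
    reps = []  # index of the first answer in each cluster
    mass = []  # total probability assigned to each cluster
    for i, p in enumerate(probs):
        for c, r in enumerate(reps):
            # Join the first cluster whose representative is similar enough.
            if sim[i][r] >= threshold:
                mass[c] += p
                break
        else:
            # No cluster matched, so this answer starts a new one.
            reps.append(i)
            mass.append(p)
    return -sum(m * math.log(m) for m in mass if m > 0)


# Three sampled answers; answers 0 and 1 are fairly similar (0.7).
probs = [0.5, 0.3, 0.2]
sim = [
    [1.0, 0.7, 0.1],
    [0.7, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

low = semantic_entropy(probs, sim, threshold=0.6)   # answers 0 and 1 merge
high = semantic_entropy(probs, sim, threshold=0.8)  # all answers stay separate
```

With these toy numbers the entropy roughly doubles when the threshold moves from 0.6 to 0.8, even though the underlying similarity scores are unchanged. A score of 0.7 is treated either as full equivalence or as no relation at all, which is the information loss a continuous metric aims to avoid.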
