Towards Generation-Efficient Uncertainty Estimation in Large Language Models
Mingcheng Zhu, Yu Liu, Tingting Zhu

TL;DR
This paper investigates whether uncertainty in large language models can be estimated from partial generations or input-only information, reducing inference cost while maintaining estimation quality.
Contribution
It introduces a unified framework for uncertainty estimation, proposes two new methods, Logit Magnitude and MetaUE, and shows that they remain effective while requiring less generation.
Findings
Logit Magnitude performs strongly in experiments.
Partial generations often suffice for effective uncertainty estimation.
MetaUE offers a competitive input-only uncertainty predictor.
Abstract
Uncertainty estimation is important for deploying LLMs in high-stakes applications such as healthcare and finance, where hallucinations can appear fluent and plausible while being factually incorrect, making it difficult for users to judge whether an output should be trusted. Existing methods require one or more full autoregressive generations to estimate uncertainty, which introduces substantial inference cost and often delays uncertainty assessment. In this paper, we investigate whether effective uncertainty estimation can be achieved with partial generation or even input-only information. Specifically, we first develop a unified framework that formulates uncertainty estimation as an early estimation problem over the autoregressive generation process of LLMs. This framework organises existing and proposed estimators by the information they observe, ranging from multi-generation to…
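To make the "early estimation" idea concrete, here is a minimal sketch of a generic prefix-based estimator: it averages token-level predictive entropy over only the first `k` decoding steps instead of the full generation. This is an illustrative baseline, not the paper's Logit Magnitude or MetaUE (the abstract excerpt does not define them). The names `prefix_uncertainty`, `step_logits`, and `k` are hypothetical, and the logits are assumed to have already been collected from a model, one vector per generated token.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_entropy(logits: List[float]) -> float:
    """Shannon entropy (in nats) of the next-token distribution."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def prefix_uncertainty(step_logits: List[List[float]], k: Optional[int] = None) -> float:
    """Mean token entropy over the first k decoding steps.

    k=None uses every step (the full-generation setting); a small k
    corresponds to estimating uncertainty from a partial generation.
    Assumption: higher mean entropy is read as higher uncertainty.
    """
    prefix = step_logits if k is None else step_logits[:k]
    if not prefix:
        raise ValueError("need at least one decoding step")
    return sum(token_entropy(logits) for logits in prefix) / len(prefix)


# Toy example with a 2-token vocabulary: the first step is maximally
# uncertain (uniform logits), the second step is almost deterministic.
toy_steps = [[0.0, 0.0], [100.0, 0.0]]
early_score = prefix_uncertainty(toy_steps, k=1)  # uses the first step only
full_score = prefix_uncertainty(toy_steps)        # uses both steps
```

In this toy case the early score is higher than the full-generation score because the uncertain step comes first. Whether a short prefix tracks the full-generation score on real outputs is the empirical question the paper studies.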
