Gradients with Respect to Semantics Preserving Embeddings Tell the Uncertainty of Large Language Models
Mingda Li, Rundong Lv, Xinyu Li, Weinan Zhang, Ting Liu

TL;DR
This paper introduces SemGrad, a novel gradient-based uncertainty quantification method for large language models that operates in semantic space, offering a sampling-free, efficient alternative to existing approaches.
Contribution
It proposes the first gradient-based UQ method in semantic space for free-form generation, improving efficiency and effectiveness over prior sampling-based techniques.
Findings
SemGrad provides superior uncertainty estimates compared to state-of-the-art methods.
The method is particularly effective in scenarios with multiple valid responses.
Experiments show improved performance in confidence estimation for LLM outputs.
Abstract
Uncertainty quantification (UQ) is an important technique for ensuring the trustworthiness of LLMs, given their tendency to hallucinate. Existing state-of-the-art UQ approaches for free-form generation rely heavily on sampling, which incurs high computational cost and variance. In this work, we propose the first gradient-based UQ method for free-form generation, SemGrad, which is sampling-free and computationally efficient. Unlike prior gradient-based methods developed for classification tasks that operate in parameter space, we propose to consider gradients in semantic space. Our method builds on the key intuition that a confident LLM should maintain stable output distributions under semantically equivalent input perturbations. We interpret the stability as the gradients in semantic space and introduce a Semantic Preservation Score (SPS) to identify embeddings that best capture…
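The stability intuition in the abstract can be illustrated with a toy sketch. This is not SemGrad itself: the linear-softmax "model", the matrix W, the example embeddings, and the function name logprob_grad_norm are all hypothetical, and no Semantic Preservation Score is computed. The sketch only shows that the gradient of the chosen token's log-probability with respect to an input embedding is small when the output distribution is peaked and larger when it is close to uniform.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [x / total for x in exps]

def logprob_grad_norm(W, e, y):
    """L2 norm of d log p(y | e) / d e for a toy linear-softmax model.

    W: one weight row per output token (hypothetical stand-in for an LLM).
    e: input embedding vector.
    y: index of the token whose log-probability is differentiated.
    """
    logits = [sum(w * x for w, x in zip(row, e)) for row in W]
    p = softmax(logits)
    # Analytic gradient: W[y] - sum_k p_k * W[k]
    grad = [
        W[y][j] - sum(p[k] * W[k][j] for k in range(len(W)))
        for j in range(len(e))
    ]
    return math.sqrt(sum(g * g for g in grad))

# Hypothetical 3-token vocabulary and 2-dimensional embeddings.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
confident_e = [8.0, 0.0]  # logits (8, 0, -8): sharply peaked on token 0
uncertain_e = [0.1, 0.0]  # logits (0.1, 0, -0.1): close to uniform

score_confident = logprob_grad_norm(W, confident_e, 0)
score_uncertain = logprob_grad_norm(W, uncertain_e, 0)
print(score_confident < score_uncertain)
```

In a real LLM the gradient would come from automatic differentiation through the network rather than a closed-form expression, and the choice of which embeddings to differentiate against is exactly what the paper's SPS is described as addressing.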
