Embedding Perturbation may Better Reflect Intermediate-Step Uncertainty in LLM Reasoning
Qihao Wen, Jiahao Wang, Yang Nan, Pengfei He, Ravi Tandon, Han Xu

TL;DR
This paper proposes a perturbation-based uncertainty quantification (UQ) method for LLM reasoning. It identifies uncertain intermediate steps by measuring how sensitive each token is to perturbations of the preceding token embeddings, and it outperforms existing baselines.
Contribution
The paper introduces a perturbation sensitivity metric that reflects intermediate-step uncertainty in LLM reasoning tasks better than existing UQ metrics, improving detection of unreliable reasoning steps.
Findings
Perturbation sensitivity correlates with incorrect reasoning steps.
Perturbation-based metrics outperform probability-based, sampling-based, and Bayesian baselines.
The method is simple and computationally efficient.
Abstract
Large Language Models (LLMs) have achieved significant breakthroughs across diverse domains; however, they can still produce unreliable or misleading outputs. For responsible LLM application, Uncertainty Quantification (UQ) techniques are used to estimate a model's uncertainty about its outputs, indicating the likelihood that those outputs may be problematic. For LLM reasoning tasks, it is essential to estimate the uncertainty not only for the final answer, but also for the intermediate steps of the reasoning, as this can enable more fine-grained and targeted interventions. In this study, we explore which UQ metrics better reflect the LLM's "intermediate uncertainty" during reasoning. Our study reveals that an LLM's incorrect reasoning steps tend to contain tokens that are highly sensitive to perturbations of the preceding token embeddings, indicating the model's uncertainty among…
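The idea of token sensitivity to embedding perturbations can be illustrated with a small sketch. The excerpt above does not give the paper's exact metric, so everything here is an assumption made for illustration: the toy "model" (`toy_next_token_probs`, mean pooling followed by a linear projection), the Gaussian noise scale `sigma`, and the score itself (mean absolute change in a token's probability). It is not the authors' implementation.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_next_token_probs(prefix_embeddings, output_weights):
    """Hypothetical stand-in for an LLM: mean-pool the prefix
    embeddings, then project the pooled vector to vocabulary logits."""
    n = len(prefix_embeddings)
    dim = len(prefix_embeddings[0])
    pooled = [sum(vec[d] for vec in prefix_embeddings) / n for d in range(dim)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in output_weights]
    return softmax(logits)


def perturbation_sensitivity(prefix_embeddings, output_weights, token_id,
                             sigma=0.1, n_samples=32, seed=0):
    """Assumed score: mean absolute change in the probability of
    `token_id` when Gaussian noise is added to the preceding embeddings."""
    rng = random.Random(seed)
    base = toy_next_token_probs(prefix_embeddings, output_weights)[token_id]
    total = 0.0
    for _ in range(n_samples):
        noisy = [[x + rng.gauss(0.0, sigma) for x in vec]
                 for vec in prefix_embeddings]
        p = toy_next_token_probs(noisy, output_weights)[token_id]
        total += abs(p - base)
    return total / n_samples


# Two-token vocabulary; only the first embedding dimension affects the logits.
W = [[4.0, 0.0], [-4.0, 0.0]]
confident_prefix = [[1.0, 0.5], [1.0, -0.5]]   # token 0 is strongly preferred
uncertain_prefix = [[0.0, 0.5], [0.0, -0.5]]   # the two tokens are tied

sens_confident = perturbation_sensitivity(confident_prefix, W, token_id=0)
sens_uncertain = perturbation_sensitivity(uncertain_prefix, W, token_id=0)
print("confident:", sens_confident)
print("uncertain:", sens_uncertain)
```

In this toy setting, the token predicted from the tied prefix shifts far more under noise than the token predicted from the confident prefix. A step-level score could then aggregate such token-level values, although how the paper aggregates them is not stated in this excerpt.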
