Efficient Non-Parametric Uncertainty Quantification for Black-Box Large Language Models and Decision Planning
Yao-Hung Hubert Tsai, Walter Talbott, Jian Zhang

TL;DR
This paper introduces a non-parametric, efficient uncertainty quantification method for black-box large language models, enabling trustworthy decision planning without extensive computational costs.
Contribution
It presents a novel non-parametric uncertainty estimator for black-box LLMs and a systematic decision-making agent design that improves AI trustworthiness and efficiency.
Findings
Efficient uncertainty estimation with a single inference per input
Improved decision trustworthiness through statistical interpretation
Cost-effective AI agent development without access to token logits
Abstract
Step-by-step decision planning with large language models (LLMs) is gaining attention in AI agent development. This paper focuses on decision planning with uncertainty estimation to address the hallucination problem in language models. Existing approaches are either white-box or computationally demanding, which limits the use of black-box proprietary LLMs under budget constraints. The paper's first contribution is a non-parametric uncertainty quantification method for LLMs that efficiently estimates point-wise dependencies between the input and the decision on the fly with a single inference, without access to token logits. This estimator informs the statistical interpretation of decision trustworthiness. The second contribution outlines a systematic design for a decision-making agent, generating actions like "turn on the bathroom light" based on user prompts such as "take a bath". Users will be asked to provide…
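The abstract does not spell out the estimator, so the following is only an illustrative sketch of one generic way to estimate a point-wise dependency non-parametrically, not the authors' method. It assumes the point-wise dependency is the ratio p(x, y) / (p(x) p(y)) and approximates it with Gaussian kernel density estimates over a set of previously collected (input, decision) embedding pairs. The names `gaussian_kernel`, `pointwise_dependency`, `is_trustworthy`, the `bandwidth` parameter, and the threshold of 1.0 are all hypothetical choices made for this example.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Unnormalized Gaussian kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def pointwise_dependency(x, y, pairs, bandwidth=1.0):
    """Kernel estimate of p(x, y) / (p(x) * p(y)).

    `pairs` is a list of (input_embedding, decision_embedding) tuples
    (an assumed reference set; the source does not describe one).
    Kernel normalization constants cancel in the ratio, so the
    unnormalized kernel is sufficient.
    """
    n = len(pairs)
    kx = [gaussian_kernel(x, xi, bandwidth) for xi, _ in pairs]
    ky = [gaussian_kernel(y, yi, bandwidth) for _, yi in pairs]
    joint = sum(a * b for a, b in zip(kx, ky)) / n
    marginals = (sum(kx) / n) * (sum(ky) / n)
    return joint / marginals if marginals > 0 else 0.0


def is_trustworthy(x, y, pairs, threshold=1.0, bandwidth=1.0):
    """Hypothetical rule: trust a decision whose estimated dependency
    on the input exceeds what independence (ratio of 1) would give."""
    return pointwise_dependency(x, y, pairs, bandwidth) > threshold


# Toy one-dimensional "embeddings": inputs near 0 pair with decisions
# near 0, and inputs near 5 pair with decisions near 5.
pairs = [([0.0], [0.0]), ([0.0], [0.1]), ([5.0], [5.0]), ([5.0], [5.1])]
matched = pointwise_dependency([0.0], [0.0], pairs)
mismatched = pointwise_dependency([0.0], [5.0], pairs)
```

In this toy setup the matched pair scores above 1 and the mismatched pair scores well below 1, which is the kind of signal a planner could use to decide whether to ask the user for more information before acting.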
Taxonomy
Topics: Semantic Web and Ontologies · Natural Language Processing Techniques
