Quantifying Uncertainty in Natural Language Explanations of Large Language Models

Sree Harsha Tanneru; Chirag Agarwal; Himabindu Lakkaraju

arXiv:2311.03533·cs.CL·November 8, 2023·1 cites

TL;DR

This paper introduces two novel metrics to quantify the uncertainty in LLM explanations, revealing that probing uncertainty correlates with explanation faithfulness, thus advancing trustworthiness assessment of large language models.

Contribution

It proposes verbalized and probing uncertainty metrics for LLM explanations, with empirical analysis showing probing uncertainty's correlation with explanation faithfulness.

Findings

1. Verbalized uncertainty is unreliable for estimating explanation confidence.
2. Probing uncertainty correlates with explanation faithfulness.
3. Lower probing uncertainty indicates more faithful explanations.

Abstract

Large Language Models (LLMs) are increasingly used as powerful tools for several high-stakes natural language processing (NLP) applications. Recent prompting works claim to elicit intermediate reasoning steps and key tokens that serve as proxy explanations for LLM predictions. However, it is not certain whether these explanations are reliable and reflect the LLM's behavior. In this work, we make one of the first attempts at quantifying the uncertainty in explanations of LLMs. To this end, we propose two novel metrics, Verbalized Uncertainty and Probing Uncertainty, to quantify the uncertainty of generated explanations. Verbalized uncertainty involves prompting the LLM to express its confidence in its explanations, whereas probing uncertainty leverages sample and model perturbations to quantify the uncertainty. Our empirical analysis of benchmark…
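To make the sample-perturbation idea concrete, here is a minimal sketch of one way a probing-style score could be computed. It is an illustration under stated assumptions, not the paper's implementation: the function names (`probing_uncertainty`, `drop_one_word`, `longest_words_explainer`), the word-dropping perturbation, and the use of Jaccard overlap as the agreement measure are all hypothetical choices made here. The explainer is a toy stand-in for an LLM call that returns key tokens.

```python
import itertools
import random
from typing import Callable, List, Sequence


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Overlap between two sets of key tokens (1.0 means identical sets)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def drop_one_word(text: str, rng: random.Random) -> str:
    """Hypothetical sample perturbation: remove one randomly chosen word."""
    words = text.split()
    if len(words) <= 1:
        return text
    i = rng.randrange(len(words))
    return " ".join(words[:i] + words[i + 1:])


def longest_words_explainer(text: str, k: int = 2) -> List[str]:
    """Toy stand-in for an LLM that returns k 'key tokens' as its explanation."""
    return sorted(text.split(), key=len, reverse=True)[:k]


def probing_uncertainty(
    text: str,
    explain_fn: Callable[[str], List[str]],
    perturb_fn: Callable[[str, random.Random], str],
    n_samples: int = 5,
    seed: int = 0,
) -> float:
    """Return 1 - mean pairwise agreement of explanations across perturbed inputs.

    Assumed convention: 0.0 means the explanation never changes under
    perturbation; 1.0 means no two explanations share any token.
    """
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2 to compare explanations")
    rng = random.Random(seed)
    explanations = [explain_fn(perturb_fn(text, rng)) for _ in range(n_samples)]
    pairs = list(itertools.combinations(explanations, 2))
    agreement = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - agreement


if __name__ == "__main__":
    score = probing_uncertainty(
        "the film was surprisingly wonderful and moving",
        longest_words_explainer,
        drop_one_word,
    )
    print(f"probing-style uncertainty: {score:.2f}")
```

In practice `explain_fn` would wrap a prompt to the model under study, and a model-perturbation variant could be sketched the same way by varying decoding settings instead of the input. The choice of agreement measure is open; Jaccard overlap is used here only because it is simple to state.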

Code & Models

Repositories

harsha070/uncertainty-quantification-nle (Official)

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques