Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models

Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, and Ali Emami

arXiv:2405.16282·cs.CL·June 18, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper investigates how well large language models' internal token probabilities align with the confidence they express when asked. It finds that GPT-4 shows the strongest alignment among the models tested, a result relevant to risk assessment and trustworthiness evaluation.

Contribution

It introduces the concept of Confidence-Probability Alignment and evaluates this alignment across different models and prompting techniques, highlighting GPT-4's superior performance.

Findings

01. GPT-4 exhibits the strongest confidence-probability alignment among models tested.

02. Various prompting techniques can influence the perceived confidence of LLMs.

03. Alignment scores vary significantly across different models and tasks.

Abstract

As the use of Large Language Models (LLMs) becomes more widespread, understanding how they evaluate their own confidence in generated responses becomes increasingly important, because this self-evaluation is integral to the reliability of their output. We introduce the concept of Confidence-Probability Alignment, which connects an LLM's internal confidence, quantified by token probabilities, to the confidence conveyed in the model's response when explicitly asked about its certainty. Using various datasets and prompting techniques that encourage model introspection, we probe the alignment between models' internal and expressed confidence. These techniques encompass using structured evaluation scales to rate confidence, including answer options when prompting, and eliciting the model's confidence level for outputs it does not recognize as its own. Notably, among the models analyzed, OpenAI's GPT-4…
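
To make the idea of comparing internal and expressed confidence concrete, here is a minimal sketch in Python. It assumes the alignment is scored with a Spearman rank correlation between the probability of each chosen answer and a verbalized rating on a 1-5 scale; the abstract above does not specify the measure, so that choice is an assumption. The function names (`answer_probability`, `alignment_score`) and all numbers are hypothetical illustrations, not the authors' code or data.

```python
import math


def answer_probability(option_logits, chosen_index):
    """Softmax over answer-option logits; returns the chosen option's probability."""
    peak = max(option_logits)
    exps = [math.exp(x - peak) for x in option_logits]  # shift for numerical stability
    return exps[chosen_index] / sum(exps)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def alignment_score(internal_probs, expressed_ratings):
    """Assumed alignment measure: Spearman correlation (Pearson on ranks)."""
    return pearson(average_ranks(internal_probs), average_ranks(expressed_ratings))


# Hypothetical data: one entry per question.
internal = [0.95, 0.60, 0.30, 0.85, 0.50]  # probability of the chosen answer
expressed = [5, 3, 1, 4, 2]                # self-reported confidence, 1-5 scale
score = alignment_score(internal, expressed)
print(f"alignment (Spearman rho): {score:.2f}")
```

A score near 1 would mean the model tends to report higher confidence exactly where its token probabilities are higher; a score near 0 would mean the two signals are unrelated. A rank-based measure is used in this sketch because a verbal rating scale is ordinal, so only the ordering of the ratings is compared with the ordering of the probabilities.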

Code & Models

Repositories

akkeshav/confidence_probability_alignment
none · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections