Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs

Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, and Bill Howe

arXiv:2506.00582 · cs.AI · July 29, 2025

TL;DR

This paper investigates whether large language models exhibit confidence biases similar to those of humans and introduces a self-assessment method, Answer-Free Confidence Estimation (AFCE), to improve their confidence calibration and interpretability.

Contribution

The paper shows that LLMs exhibit biased confidence patterns and proposes AFCE, a two-stage prompting method, to improve their confidence calibration and reduce overconfidence.

Findings

01. Models exhibit less sensitivity to task difficulty than humans.

02. AFCE reduces overconfidence in LLMs.

03. AFCE improves alignment of model confidence with actual accuracy.
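
The findings above refer to overconfidence and to the alignment between confidence and accuracy. As a minimal sketch of how such quantities are commonly measured, the Python below computes the gap between mean stated confidence and accuracy, plus expected calibration error (ECE). This summary does not say which metrics the paper uses, so the choice of ECE, the function names, and the 10-bin default are assumptions made for illustration only.

```python
from typing import Sequence


def overconfidence_gap(confidences: Sequence[float], correct: Sequence[int]) -> float:
    """Mean stated confidence minus accuracy; a positive value indicates overconfidence."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_confidence - accuracy


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[int], n_bins: int = 10
) -> float:
    """Size-weighted average of |mean confidence - accuracy| over equal-width bins.

    Confidences are assumed to lie in [0, 1]; the bin count is an illustrative default.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, is_correct))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        bin_confidence = sum(c for c, _ in bucket) / len(bucket)
        bin_accuracy = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(bin_confidence - bin_accuracy)
    return ece
```

For example, four answers each given 0.9 confidence, of which two are correct, yield a positive gap, which is the overconfident direction; a well-calibrated model would score near zero on both measures.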

Abstract

Psychology research has shown that humans are poor at estimating their performance on tasks, tending towards underconfidence on easy tasks and overconfidence on difficult tasks. We examine three LLMs, Llama-3-70B-instruct, Claude-3-Sonnet, and GPT-4o, on a range of QA tasks of varying difficulty, and show that models exhibit subtle differences from human patterns of overconfidence. The models are less sensitive to task difficulty, and when prompted to answer based on different personas (e.g., expert vs. layman, or different races, genders, and ages), they respond with stereotypically biased confidence estimations even though their underlying answer accuracy remains the same. Based on these observations, we propose Answer-Free Confidence Estimation (AFCE) to improve confidence calibration and LLM interpretability in these settings. AFCE is a self-assessment method that employs two stages of…

Code & Models

Repositories

chenjux/afce (Official)

Videos

Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs · Underline