How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models

Dharshan Kumaran; Stephen M Fleming; Larisa Markeeva; Joe Heyward; Andrea Banino; Mrinal Mathur; Razvan Pascanu; Simon Osindero; Benedetto de Martino; Petar Velickovic; Viorica Patraucean

arXiv:2507.03120·cs.LG·July 8, 2025

TL;DR

This paper investigates why large language models can appear overconfident in their initial answers yet prone to excessive doubt when challenged. It traces both behaviors to two mechanisms, a choice-supportive bias and an overweighting of inconsistent advice, that shape how readily the models change their mind.

Contribution

It introduces a novel experimental paradigm that obtains confidence estimates from LLMs without creating a memory of their initial judgments, which is impossible in human participants. The paradigm uncovers mechanisms behind LLMs' resistance to changing their answers and their sensitivity to contradictory advice.

Findings

01. LLMs show a choice-supportive bias that increases confidence in their initial answers.

02. LLMs overweight inconsistent advice relative to consistent advice, deviating from normative Bayesian updating.

03. These mechanisms explain LLMs' stubbornness and sensitivity to criticism.

Abstract

Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly overconfident in their initial answers whilst at the same time being prone to excessive doubt when challenged. To investigate this apparent paradox, we developed a novel experimental paradigm, exploiting the unique ability to obtain confidence estimates from LLMs without creating memory of their initial judgments -- something impossible in human participants. We show that LLMs -- Gemma 3, GPT4o and o1-preview -- exhibit a pronounced choice-supportive bias that reinforces and boosts their estimate of confidence in their answer, resulting in a marked resistance to change their mind. We further demonstrate that LLMs markedly overweight inconsistent compared to consistent advice, in a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that these two…
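To make the reference to normative Bayesian updating concrete, here is a minimal sketch of how an ideal observer would revise confidence in a two-option answer after hearing from an advisor. It is an illustration under stated assumptions, not the paper's model: the binary-choice setup, the advisor's known accuracy, and the names `bayes_update`, `weighted_update`, `w_consistent` and `w_inconsistent` are all hypothetical.

```python
import math


def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bayes_update(prior: float, advisor_accuracy: float, agrees: bool) -> float:
    """Posterior confidence that the initial answer is correct.

    Assumes a two-option question and an advisor who is right with
    probability `advisor_accuracy`, independently of the initial answer.
    """
    evidence = logit(advisor_accuracy)  # log-likelihood ratio of the advice
    shift = evidence if agrees else -evidence
    return sigmoid(logit(prior) + shift)


def weighted_update(prior: float, advisor_accuracy: float, agrees: bool,
                    w_consistent: float = 1.0,
                    w_inconsistent: float = 1.0) -> float:
    """Hypothetical non-normative variant with separate advice weights.

    With both weights equal to 1 this reduces to `bayes_update`.
    A value of `w_inconsistent` above 1 mimics overweighting of
    disagreeing advice; the weights are illustrative, not fitted values.
    """
    evidence = logit(advisor_accuracy)
    shift = w_consistent * evidence if agrees else -w_inconsistent * evidence
    return sigmoid(logit(prior) + shift)


if __name__ == "__main__":
    prior = 0.7
    for agrees in (True, False):
        normative = bayes_update(prior, 0.7, agrees)
        biased = weighted_update(prior, 0.7, agrees, w_inconsistent=2.0)
        print(f"agrees={agrees}: normative={normative:.3f}, biased={biased:.3f}")
```

In this sketch the normative update moves confidence by the same amount in log-odds whether the advisor agrees or disagrees, so an asymmetry between the two weights is one simple way to express a departure from the Bayesian benchmark.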
