How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models
Dharshan Kumaran, Stephen M Fleming, Larisa Markeeva, Joe Heyward, Andrea Banino, Mrinal Mathur, Razvan Pascanu, Simon Osindero, Benedetto de Martino, Petar Velickovic, Viorica Patraucean

TL;DR
This paper investigates why large language models appear both stubbornly overconfident in their initial answers and prone to excessive doubt when challenged. It traces the pattern to two mechanisms, a choice-supportive bias and the overweighting of inconsistent advice, which together shape how willing the models are to change their minds.
Contribution
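It introduces a novel experimental paradigm that obtains confidence estimates from LLMs without creating a memory of their initial judgments. Using this paradigm, it uncovers previously poorly understood mechanisms behind the models' resistance to changing their answers and their sensitivity to contradictory advice.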
Findings
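LLMs show a choice-supportive bias that boosts their confidence in their initial answers and makes them resistant to changing their minds.
LLMs overweight advice that is inconsistent with their answer relative to advice that is consistent with it, deviating qualitatively from normative Bayesian updating.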
These mechanisms explain LLMs' stubbornness and sensitivity to criticism.
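To make the Bayesian benchmark in the findings above concrete, here is a minimal sketch of what a normative update looks like under simplifying assumptions that are ours, not the paper's: a binary choice, an advisor whose accuracy is known and above 50%, and advice that is independent of the model's own answer. The function names (`bayes_update`, `advice_weight`) and all numeric values are hypothetical illustrations. Under these assumptions, agreeing and disagreeing advice shift confidence by the same amount in log-odds, so "overweighting inconsistent advice" corresponds to a disagreement shift that is larger than this symmetric benchmark.

```python
import math


def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bayes_update(prior: float, advisor_accuracy: float, agrees: bool) -> float:
    """Normative posterior that the initial answer is correct.

    Assumes a binary choice and an advisor who is right with probability
    `advisor_accuracy`, independently of the initial answer.
    """
    evidence = logit(advisor_accuracy)
    shift = evidence if agrees else -evidence
    return sigmoid(logit(prior) + shift)


def advice_weight(prior: float, posterior: float, advisor_accuracy: float) -> float:
    """Size of an observed log-odds shift relative to the normative shift.

    1.0 matches the Bayesian benchmark; values above 1.0 mean the advice
    was overweighted. Requires advisor_accuracy > 0.5.
    """
    return abs(logit(posterior) - logit(prior)) / logit(advisor_accuracy)


if __name__ == "__main__":
    prior, accuracy = 0.7, 0.7  # hypothetical values for illustration
    print("agree   :", bayes_update(prior, accuracy, agrees=True))
    print("disagree:", bayes_update(prior, accuracy, agrees=False))

    # Hypothetical observed confidence after disagreeing advice (not paper data).
    observed_posterior = 0.35
    print("weight  :", advice_weight(prior, observed_posterior, accuracy))
```

In this sketch, a weight near 1.0 for both agreeing and disagreeing advice would indicate symmetric, Bayesian-like updating; a weight well above 1.0 only for disagreeing advice would mirror the asymmetry described in the findings.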
Abstract
Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly overconfident in their initial answers whilst at the same time being prone to excessive doubt when challenged. To investigate this apparent paradox, we developed a novel experimental paradigm, exploiting the unique ability to obtain confidence estimates from LLMs without creating memory of their initial judgments -- something impossible in human participants. We show that LLMs -- Gemma 3, GPT4o and o1-preview -- exhibit a pronounced choice-supportive bias that reinforces and boosts their estimate of confidence in their answer, resulting in a marked resistance to change their mind. We further demonstrate that LLMs markedly overweight inconsistent compared to consistent advice, in a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that these two…
