Large Language Models are overconfident and amplify human bias
Fengfei Sun, Ningke Li, Kailong Wang, Lorenz Goette

TL;DR
This paper investigates overconfidence in large language models. It finds that LLMs are substantially more overconfident than humans, especially when uncertain, and that LLM input raises both the accuracy and the overconfidence of the humans who use it.
Contribution
The paper provides the first systematic evaluation of overconfidence in LLMs, measuring how overconfident they are and how their input affects human decision-making.
Findings
All five LLMs studied overestimate the probability that their answers are correct by between 20% and 60%.
Humans match the accuracy of the more advanced LLMs but are far less overconfident.
LLM input increases both accuracy and overconfidence in humans.
Abstract
Large language models (LLMs) are revolutionizing every aspect of society. They are increasingly used in problem-solving tasks to substitute human assessment and reasoning. LLMs are trained on what humans write and are thus exposed to human bias. We evaluate whether LLMs inherit one of the most widespread human biases: overconfidence. We algorithmically construct reasoning problems with known ground truths. We prompt LLMs to answer these problems and assess the confidence in their answers, closely following similar protocols in human experiments. We find that all five LLMs we study are overconfident: they overestimate the probability that their answer is correct by between 20% and 60%. Humans have accuracy similar to the more advanced LLMs, but far lower overconfidence. Although humans and LLMs are similarly biased in questions that they are certain they answered correctly, a key…
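One common way to quantify the overconfidence described in the abstract is the gap between mean stated confidence and realized accuracy. The sketch below assumes that definition; the paper's exact metric may differ. The function name `overconfidence` and the sample data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def overconfidence(confidences, correct):
    """Return mean stated confidence minus realized accuracy.

    confidences: stated probabilities (0.0 to 1.0) that each answer is correct.
    correct: booleans marking whether each answer matched the ground truth.
    A positive result indicates overconfidence, a negative one underconfidence.
    This is an assumed definition, not necessarily the paper's metric.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    accuracy = mean(1.0 if c else 0.0 for c in correct)
    return mean(confidences) - accuracy


# Hypothetical responses: four answers with stated confidence and correctness.
stated = [0.9, 0.8, 1.0, 0.7]
was_correct = [True, False, True, False]
gap = overconfidence(stated, was_correct)
print(f"overconfidence gap: {gap:.2f}")
```

Under this definition, a perfectly calibrated respondent has a gap of zero, so the same function can be applied to human and LLM responses alike for comparison.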
Taxonomy
Topics: Topic Modeling
