Dual Traits in Probabilistic Reasoning of Large Language Models
Shenxiong Li, Huaxia Rui

TL;DR
This paper examines how large language models evaluate posterior probabilities. It finds two coexisting modes of judgment, a normative mode that follows Bayes' rule and a representative-based mode that relies on similarity, and shows that aligning model judgments with Bayesian principles remains difficult.
Contribution
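The paper identifies two coexisting modes of probabilistic reasoning in LLMs and conjectures that they result from the contrastive loss function used in reinforcement learning from human feedback, offering insight into the models' cognitive biases and limitations.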
Findings
LLMs exhibit two modes of probability judgment: normative and representative-based.
LLMs struggle to recall base rate information accurately.
Prompt engineering alone may not fully mitigate representative-based biases.
Abstract
We conducted three experiments to investigate how large language models (LLMs) evaluate posterior probabilities. Our results reveal the coexistence of two modes in posterior judgment among state-of-the-art models: a normative mode, which adheres to Bayes' rule, and a representative-based mode, which relies on similarity -- paralleling human System 1 and System 2 thinking. Additionally, we observed that LLMs struggle to recall base rate information from their memory, and developing prompt engineering strategies to mitigate representative-based judgment may be challenging. We further conjecture that the dual modes of judgment may be a result of the contrastive loss function employed in reinforcement learning from human feedback. Our findings underscore the potential direction for reducing cognitive biases in LLMs and the necessity for cautious deployment of LLMs in critical areas.
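To make the contrast between the two modes concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' experimental setup: the scenario, the numbers, and the function names `posterior` and `similarity_only` are all hypothetical. The normative mode applies Bayes' rule with the base rate; the representative-based mode is approximated here, as an assumption, by a judgment that compares likelihoods and ignores the base rate.

```python
def posterior(prior: float, like_h: float, like_not_h: float) -> float:
    """Bayes' rule for a binary hypothesis H given evidence E.

    prior      -- P(H), the base rate
    like_h     -- P(E | H)
    like_not_h -- P(E | not H)
    """
    evidence = prior * like_h + (1.0 - prior) * like_not_h
    return prior * like_h / evidence


def similarity_only(like_h: float, like_not_h: float) -> float:
    """Hypothetical stand-in for a representative-based judgment:
    compares how well the evidence fits each hypothesis and
    ignores the base rate entirely (equivalent to a 50/50 prior)."""
    return like_h / (like_h + like_not_h)


# Hypothetical numbers, chosen only for illustration.
base_rate = 0.30          # P(H): 30% of the population are engineers
p_desc_engineer = 0.80    # P(description | engineer)
p_desc_other = 0.20       # P(description | not engineer)

normative = posterior(base_rate, p_desc_engineer, p_desc_other)
representative = similarity_only(p_desc_engineer, p_desc_other)

print(f"normative (Bayes' rule):      {normative:.3f}")
print(f"representative (similarity):  {representative:.3f}")
```

With a base rate below 50%, the similarity-only estimate exceeds the Bayesian posterior, which is the direction of error associated with base-rate neglect.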
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Balanced Selection
