Large Language Models Assume People are More Rational than We Really are

Ryan Liu; Jiayi Geng; Joshua C. Peterson; Ilia Sucholutsky; Thomas L. Griffiths

arXiv:2406.17055·cs.CL·March 11, 2025·1 cites

TL;DR

Large Language Models tend to assume humans are more rational than they are, aligning more with expected value theory than actual human decision-making, which impacts their communication and prediction of human behavior.

Contribution

This paper shows that state-of-the-art LLMs overestimate human rationality: their implicit models of human decision-making align with expected value theory rather than with actual human decision patterns.

Findings

01

LLMs assume higher rationality than humans exhibit

02

Models align more with expected value theory than real human behavior

03

Inferences about others' decisions are highly correlated between LLMs and humans
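As a rough illustration of what "aligning with expected value theory" means in the findings above, the sketch below scores two gambles by expected value and picks the higher one. The gambles, the `Gamble` type, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code.

```python
from typing import List, Tuple

# A gamble is a list of (probability, payoff) outcomes. Illustrative type only.
Gamble = List[Tuple[float, float]]


def expected_value(gamble: Gamble) -> float:
    """Return the probability-weighted average payoff of a gamble."""
    return sum(p * x for p, x in gamble)


def ev_maximizing_choice(a: Gamble, b: Gamble) -> str:
    """Return the option an expected-value maximizer would choose."""
    ev_a, ev_b = expected_value(a), expected_value(b)
    if ev_a == ev_b:
        return "tie"
    return "A" if ev_a > ev_b else "B"


# Hypothetical example: a sure payoff versus a risky gamble.
gamble_a: Gamble = [(1.0, 30.0)]              # 30 for certain
gamble_b: Gamble = [(0.5, 80.0), (0.5, 0.0)]  # 50% chance of 80, else 0

print(expected_value(gamble_a), expected_value(gamble_b))
print(ev_maximizing_choice(gamble_a, gamble_b))
```

An expected-value maximizer picks B here because its average payoff is higher, whereas a risk-averse person might still prefer the sure payoff in A. A model that predicts people will choose B in cases like this is the kind of overestimate of rationality the findings describe.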

Abstract

In order for AI systems to communicate effectively with people, they must understand how we make decisions. However, people's decisions are not always rational, so the implicit internal models of human decision-making in Large Language Models (LLMs) must account for this. Previous empirical evidence seems to suggest that these implicit models are accurate -- LLMs offer believable proxies of human behavior, acting how we expect humans would in everyday interactions. However, by comparing LLM behavior and predictions to a large dataset of human decisions, we find that this is actually not the case: when both simulating and predicting people's choices, a suite of cutting-edge LLMs (GPT-4o & 4-Turbo, Llama-3-8B & 70B, Claude 3 Opus) assume that people are more rational than we really are. Specifically, these models deviate from human behavior and align more closely with a classic model of…

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 5 · Confidence 4

Strengths

- The application of human psychometric data to analyzing LLM behavior reveals very interesting insights into the gap between LM and human decision-making.
- The study design and the analysis are rigorous.
- The insight that "While we have focused on using theories and paradigms from psychology to analyze LLMs, there may also be opportunities to use LLMs to refine existing theories about people." is thought-provoking.

Weaknesses

- One might argue that the psychology tests used in this work appear too simplistic to inform the practical development of next-generation LMs.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- applies research from the psychology field
- good discussion of alignment

Weaknesses

see questions

Reviewer 03 · Rating 6 · Confidence 4

Strengths

- The authors explore a very interesting direction and implication in the current LLM alignment research. The current LLM alignment may actually limit the models’ ability to understand genuine human behavior.
- The authors link their work nicely to existing well-grounded cognitive science studies, which will be helpful to other readers who are not familiar with this area.

Weaknesses

- While I also believe LLMs struggle to model irrationality, I’m not entirely convinced that the current experiment design fully supports this claim.
- In the forward modeling section, the binary choice questions have an objectively optimal answer. Why should we conclude that the models are “assuming humans are more rational” simply because their answers are closer to this optimal? A simpler interpretation could be that the models are simply solving problems optimally, regardless of the prom

Code & Models

Repositories

theryanl/llm-rationality
Official

Taxonomy

Topics: Topic Modeling

Methods: ALIGN