When simulations look right but causal effects go wrong: Large language models as behavioral simulators

Zonghan Li; Feng Ji

arXiv:2604.02458 · cs.CY · April 14, 2026

TL;DR

This study evaluates how well large language models simulate behavioral responses to climate interventions. The models reproduce attitudinal patterns reasonably well but often misestimate causal effects, especially for behavior-related outcomes.

Contribution

It shows that descriptive accuracy and causal fidelity can diverge in LLM simulations of interventions, and it urges caution when interpreting LLM-derived estimates of intervention effects.

Findings

01

LLMs replicate observed attitudinal patterns reasonably well.

02

Prompt refinements improve descriptive fit but not causal accuracy.

03

Errors vary across intervention types and behavioral outcomes.

Abstract

Behavioral simulation is increasingly used to anticipate responses to interventions. Large language models (LLMs) enable researchers to specify population characteristics and intervention context in natural language, but it remains unclear to what extent LLMs can use these inputs to infer intervention effects. We evaluated three LLMs on 11 climate-psychology interventions using a dataset of 59,508 participants from 62 countries, and replicated the main analysis in two additional datasets (12 and 27 countries). LLMs reproduced observed patterns in attitudinal outcomes (e.g., climate beliefs and policy support) reasonably well, and prompting refinements improved this descriptive fit. However, descriptive fit did not reliably translate into causal fidelity (i.e., accurate estimates of intervention effects), and these two dimensions of accuracy followed different error structures. This…
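The distinction the abstract draws between descriptive fit and causal fidelity can be illustrated with a minimal Python sketch. The metrics here are assumptions chosen for illustration (mean absolute error of condition means for descriptive fit, error in the treatment-minus-control difference for causal fidelity); the visible text does not state which metrics the paper uses. The function names and the numbers are hypothetical and are not data from the study.

```python
from statistics import mean


def condition_means(responses):
    """Average outcome per condition; `responses` maps condition -> list of scores."""
    return {cond: mean(scores) for cond, scores in responses.items()}


def descriptive_error(observed, simulated):
    """Assumed descriptive-fit metric: mean absolute gap between condition means."""
    obs = condition_means(observed)
    sim = condition_means(simulated)
    return mean(abs(obs[cond] - sim[cond]) for cond in obs)


def treatment_effect(responses, treatment="treatment", control="control"):
    """Difference in mean outcome between the treated and control conditions."""
    means = condition_means(responses)
    return means[treatment] - means[control]


def causal_error(observed, simulated):
    """Assumed causal-fidelity metric: absolute error in the estimated effect."""
    return abs(treatment_effect(observed) - treatment_effect(simulated))


# Made-up scores on a 0-100 scale, NOT data from the paper.
observed = {"control": [60, 62, 64], "treatment": [65, 67, 69]}
simulated = {"control": [62, 64, 66], "treatment": [61, 63, 65]}

print("descriptive error:", descriptive_error(observed, simulated))
print("observed effect:  ", treatment_effect(observed))
print("simulated effect: ", treatment_effect(simulated))
print("causal error:     ", causal_error(observed, simulated))
```

In this toy case the simulated condition means sit within a few points of the observed ones, yet the simulated effect has the opposite sign from the observed effect. Small level errors in each condition can compound when the two conditions are differenced, which is one way a simulation can look right descriptively while getting the intervention effect wrong.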
