The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study

Victoria Lin; Taedong Yun; Maja Matarić; John Canny; Arthur Gretton; Alexander D'Amour

arXiv:2605.20767·cs.CL·May 21, 2026

TL;DR

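This paper shows that interventions in LLM-simulated experiments can shift the latent attributes of the simulated users (user drift), so that the implicit population differs across treatment conditions and effect estimates become biased. It proposes methods to diagnose and mitigate this bias.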

Contribution

The paper formalizes the confounding or selection bias that user drift induces in LLM-simulated experiments and introduces diagnostic and adjustment techniques to reduce it.
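
As a rough illustration of how drift can bias a naive comparison, the derivation below uses assumed notation that is not taken from the paper: $T$ is the intervention, $Y$ the simulated response, $U$ a discrete latent user attribute, $\mu_t(u) = \mathbb{E}[Y \mid T=t, U=u]$, and $p_t(u) = P(U=u \mid T=t)$. The control-arm population $p_0$ is taken as the reference population, which is also an assumption of this sketch.

```latex
\begin{align*}
\Delta_{\text{obs}} &= \sum_u \mu_1(u)\, p_1(u) - \sum_u \mu_0(u)\, p_0(u), \\
\Delta &= \sum_u \bigl(\mu_1(u) - \mu_0(u)\bigr)\, p_0(u), \\
\Delta_{\text{obs}} - \Delta &= \sum_u \mu_1(u)\, \bigl(p_1(u) - p_0(u)\bigr).
\end{align*}
```

Under these assumptions the gap vanishes when the intervention leaves the attribute distribution unchanged ($p_1 = p_0$). Otherwise its sign depends on whether the intervention shifts mass toward attribute values with higher or lower responses, which is one way to see how drift could either inflate or attenuate an observed difference.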

Findings

01. Negative control outcomes can detect distribution shifts caused by user drift.

02. Targeted confounder elicitation reduces bias in LLM intervention studies.
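
To make the second finding concrete, here is a minimal sketch of one standard way to adjust for an elicited attribute: stratify on it and reweight to a reference arm (standardization). This is not necessarily the paper's procedure. The row format, the attribute name `u`, the function names, and the toy numbers are all hypothetical and chosen only for illustration.

```python
def naive_effect(rows):
    """Difference in mean outcome between treated (t=1) and control (t=0) rows."""
    treated = [r["y"] for r in rows if r["t"] == 1]
    control = [r["y"] for r in rows if r["t"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def standardized_effect(rows):
    """Stratify on the elicited attribute u and reweight to the control arm's mix of u."""
    sums, counts = {}, {}
    for r in rows:
        key = (r["t"], r["u"])
        sums[key] = sums.get(key, 0.0) + r["y"]
        counts[key] = counts.get(key, 0) + 1

    control_strata = [u for (t, u) in counts if t == 0]
    control_total = sum(counts[(0, u)] for u in control_strata)

    effect = 0.0
    for u in control_strata:
        if (1, u) not in counts:
            # Without overlap the stratum-level contrast is undefined.
            raise ValueError(f"no treated rows for stratum {u!r}")
        weight = counts[(0, u)] / control_total
        mean_treated = sums[(1, u)] / counts[(1, u)]
        mean_control = sums[(0, u)] / counts[(0, u)]
        effect += weight * (mean_treated - mean_control)
    return effect


def make_rows(t, u, n):
    """Toy outcome model (an assumption): y = 1.0 * t + 2.0 if the user is an expert."""
    y = 1.0 * t + (2.0 if u == "expert" else 0.0)
    return [{"t": t, "u": u, "y": y} for _ in range(n)]


# Fabricated toy data: the control arm is 50/50 novice/expert, while the
# treated arm has drifted to 20/80. The within-stratum effect is 1.0 by construction.
rows = (
    make_rows(0, "novice", 50) + make_rows(0, "expert", 50)
    + make_rows(1, "novice", 20) + make_rows(1, "expert", 80)
)

naive = naive_effect(rows)            # approximately 1.6, inflated by the drift
adjusted = standardized_effect(rows)  # approximately 1.0, the built-in effect
```

In this toy setup the naive contrast overstates the effect because the treated arm contains more experts, who respond more strongly regardless of treatment. The adjustment only works here because the drifting attribute was elicited and every stratum appears in both arms.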

Abstract

Large language models (LLMs) show potential as simulators of human behavior, offering a scalable way to study responses to interventions. However, because LLMs are trained largely on observational data, interventions in experiments with LLM-simulated synthetic users can induce unintended shifts in latent user attributes, causing user drift, where the implicit simulated population differs across treatment conditions, potentially distorting effect estimates. We formalize the confounding or selection bias that can arise due to user drift and show how intervention-dependent shifts can inflate or attenuate observed differences in user responses under intervention. To diagnose confounding, we propose using negative control outcomes (attributes that should remain invariant under intervention) to identify distribution shifts across intervention conditions, providing evidence of user drift. To…
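
The negative control idea described in the abstract can be sketched as a simple two-sample check: pick an attribute that the intervention should not change, collect it from simulated users in each arm, and test whether its distribution differs. The example below is an assumption-laden illustration rather than the paper's test. It uses a permutation test on the difference in means, a hypothetical negative control (the simulated user's stated age), and fabricated toy data.

```python
import random


def permutation_p_value(control, treated, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means between two arms."""
    rng = random.Random(seed)
    observed = abs(sum(treated) / len(treated) - sum(control) / len(control))
    pooled = list(control) + list(treated)
    n_control = len(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of arms under the null
        perm_control = pooled[:n_control]
        perm_treated = pooled[n_control:]
        diff = abs(
            sum(perm_treated) / len(perm_treated)
            - sum(perm_control) / len(perm_control)
        )
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Fabricated toy data: stated age of simulated users in each arm.
# The intervention should not change age, so age serves as a negative control.
data_rng = random.Random(1)
age_control = [data_rng.gauss(40, 10) for _ in range(200)]
age_treated_drift = [data_rng.gauss(46, 10) for _ in range(200)]   # population shifted
age_treated_stable = [data_rng.gauss(40, 10) for _ in range(200)]  # same population

p_drift = permutation_p_value(age_control, age_treated_drift)
p_stable = permutation_p_value(age_control, age_treated_stable)
```

A small `p_drift` would count as evidence that the simulated population differs across arms. A large p-value does not rule out drift, since it only speaks to the attributes that were actually checked.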
