The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study
Victoria Lin, Taedong Yun, Maja Matarić, John Canny, Arthur Gretton, Alexander D'Amour

TL;DR
This paper shows that interventions in LLM-simulated experiments can shift the latent attributes of the simulated users (user drift), which biases effect estimates, and it proposes methods to diagnose and mitigate that bias.
Contribution
It formalizes the confounding or selection bias that user drift introduces into LLM-simulated experiments and introduces diagnostic and adjustment techniques to reduce that bias.
Findings
Negative control outcomes, meaning attributes that should remain invariant under intervention, can detect distribution shifts caused by user drift.
Targeted confounder elicitation reduces bias in LLM intervention studies.
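To make the second finding concrete, the sketch below shows one generic way an elicited attribute could be used for adjustment: stratify on it and average the within-stratum differences. This is a toy illustration, not the paper's procedure. The attribute levels ("low", "high"), the baseline outcomes, the arm sizes, and all function names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from collections import defaultdict

def mean(xs):
    return sum(xs) / len(xs)

def naive_effect(rows):
    """Difference in mean outcome between arms, ignoring user attributes."""
    treated = [y for t, _, y in rows if t]
    control = [y for t, _, y in rows if not t]
    return mean(treated) - mean(control)

def stratified_effect(rows):
    """Average the within-stratum differences, weighted by pooled stratum share."""
    by_stratum = defaultdict(lambda: {True: [], False: []})
    for t, stratum, y in rows:
        by_stratum[stratum][bool(t)].append(y)
    total = len(rows)
    effect = 0.0
    for arms in by_stratum.values():
        if not arms[True] or not arms[False]:
            raise ValueError("every stratum needs users in both arms")
        weight = (len(arms[True]) + len(arms[False])) / total
        effect += weight * (mean(arms[True]) - mean(arms[False]))
    return effect

# Hypothetical toy population: the attribute changes the baseline outcome,
# and the true effect of the intervention is 1.0 in every stratum.
BASE = {"low": 0.0, "high": 4.0}
TRUE_EFFECT = 1.0

def make_arm(treated, n_low, n_high):
    rows = []
    for stratum, n in (("low", n_low), ("high", n_high)):
        y = BASE[stratum] + (TRUE_EFFECT if treated else 0.0)
        rows.extend((treated, stratum, y) for _ in range(n))
    return rows

# User drift: the treated arm over-represents "high" users (80 vs. 50 of 100).
rows = make_arm(False, 50, 50) + make_arm(True, 20, 80)
print(naive_effect(rows), stratified_effect(rows))
```

In this toy setup the naive difference comes out near 2.2 while the stratified estimate is near the true effect of 1.0, so the drift inflates the apparent effect. Reversing the imbalance would attenuate it instead.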
Abstract
Large language models (LLMs) show potential as simulators of human behavior, offering a scalable way to study responses to interventions. However, because LLMs are trained largely on observational data, interventions in experiments with LLM-simulated synthetic users can induce unintended shifts in latent user attributes, causing user drift where the implicit simulated population differs across treatment conditions, potentially distorting effect estimates. We formalize the confounding or selection bias that can arise due to user drift and show how intervention-dependent shifts can inflate or attenuate observed differences in user responses under intervention. To diagnose confounding, we propose using negative control outcomes--attributes that should remain invariant under intervention--to identify distribution shifts across intervention conditions, providing evidence of user drift. To…
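The negative control diagnostic described above can be sketched as a two-sample test on an attribute that the intervention should not change. The example below uses a simple permutation test on the difference in means. The choice of test, the binary attribute, and the function name `permutation_pvalue` are assumptions made for illustration and are not taken from the paper.

```python
import random

def permutation_pvalue(control, treated, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in means between arms.

    `control` and `treated` hold a negative control attribute measured on the
    simulated users in each arm. A small p-value suggests the attribute's
    distribution differs across arms, which is evidence of user drift.
    """
    rng = random.Random(seed)
    n_control = len(control)
    observed = abs(sum(treated) / len(treated) - sum(control) / n_control)
    pooled = list(control) + list(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign arm labels at random
        perm_control = pooled[:n_control]
        perm_treated = pooled[n_control:]
        diff = abs(sum(perm_treated) / len(perm_treated)
                   - sum(perm_control) / len(perm_control))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Hypothetical binary attribute that the intervention should not affect.
control = [0] * 50 + [1] * 50
treated_stable = [0] * 50 + [1] * 50   # same composition in both arms
treated_drifted = [0] * 20 + [1] * 80  # composition shifted under intervention

print(permutation_pvalue(control, treated_stable))
print(permutation_pvalue(control, treated_drifted))
```

With identical composition in both arms the p-value is large, while the shifted arm yields a small p-value. A flagged negative control signals that the simulated populations differ across conditions, but it does not by itself say how large the resulting bias in the effect estimate is.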
