How to Make Causal Inferences Using Texts
Naoki Egami, Christian J. Fong, Justin Grimmer, Margaret E. Roberts, Brandon M. Stewart

TL;DR
This paper presents a framework for making causal inferences from text data by discovering measures, estimating latent representations, and addressing overfitting risks, demonstrated through social science experiments.
Contribution
It introduces a novel conceptual framework for causal inference using text, including methods for discovering measures and estimating latent text representations.
Findings
Framework enables causal inference with text data
Split-sample approach mitigates overfitting risks
Applied to experiments on immigration attitudes and bureaucratic response
Abstract
New text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories of interest from large collections of text. We introduce a conceptual framework for making causal inferences with discovered measures as a treatment or outcome. Our framework enables researchers to discover high-dimensional textual interventions and estimate the ways that observed treatments affect text-based outcomes. We argue that nearly all text-based causal inferences depend upon a latent representation of the text and we provide a framework to learn the latent representation. But estimating this latent representation, we show, creates new risks: we may introduce an identification problem or overfit. To address these risks we describe a split-sample framework and apply it to estimate causal effects from an experiment on immigration…
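The split-sample idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`split_indices`, `discover_measure`, `split_sample_effect`), the stratified 50/50 split, and the toy keyword-based discovery step are all assumptions made for illustration. The point it shows is that the text measure is chosen using only the training half and then held fixed, so the effect estimated on the test half is not contaminated by the search for a measure.

```python
import random
from collections import Counter


def split_indices(treatments, train_frac=0.5, seed=0):
    """Randomly split document indices into train/test halves within each arm."""
    rng = random.Random(seed)
    train, test = [], []
    for arm in (0, 1):
        arm_idx = [i for i, d in enumerate(treatments) if d == arm]
        rng.shuffle(arm_idx)
        cut = int(len(arm_idx) * train_frac)
        train += arm_idx[:cut]
        test += arm_idx[cut:]
    return train, test


def discover_measure(docs, treatments):
    """Toy discovery step (an assumption, standing in for e.g. a topic model):
    pick the word whose document frequency differs most between arms."""
    treated, control = Counter(), Counter()
    n_t = sum(treatments)
    n_c = len(treatments) - n_t
    if n_t == 0 or n_c == 0:
        raise ValueError("training half needs both treated and control documents")
    for doc, d in zip(docs, treatments):
        for word in set(doc.lower().split()):
            (treated if d == 1 else control)[word] += 1
    vocab = sorted(set(treated) | set(control))  # sorted for deterministic ties
    keyword = max(vocab, key=lambda w: abs(treated[w] / n_t - control[w] / n_c))
    # The measure g maps a document to 1 if it contains the keyword, else 0.
    return (lambda doc: int(keyword in doc.lower().split())), keyword


def difference_in_means(outcomes, treatments):
    """Difference in mean outcome between treated (1) and control (0) units."""
    t = [y for y, d in zip(outcomes, treatments) if d == 1]
    c = [y for y, d in zip(outcomes, treatments) if d == 0]
    return sum(t) / len(t) - sum(c) / len(c)


def split_sample_effect(docs, treatments, seed=0):
    """Discover the measure on the training half, estimate on the test half."""
    train, test = split_indices(treatments, seed=seed)
    g, keyword = discover_measure([docs[i] for i in train],
                                  [treatments[i] for i in train])
    test_outcomes = [g(docs[i]) for i in test]   # g is fixed; no refitting here
    test_treatments = [treatments[i] for i in test]
    return keyword, difference_in_means(test_outcomes, test_treatments)


# Hypothetical toy data: treated responses mention "fear", control ones "calm".
docs = ["border fear"] * 20 + ["border calm"] * 20
treatments = [1] * 20 + [0] * 20
keyword, effect = split_sample_effect(docs, treatments)
print(keyword, effect)
```

In practice the discovery step would be whatever text model the analyst prefers; the design choice that matters in this sketch is that nothing from the test half influences which measure is used.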
Taxonomy
Topics: Computational and Text Analysis Methods · Electoral Systems and Political Participation · Media Influence and Politics
