Causal Effect Estimation with Latent Textual Treatments
Omri Feldman, Amar Venugopal, Jann Spiess, Amir Feder

TL;DR
This paper introduces an end-to-end pipeline for generating latent textual interventions and estimating their causal effects. It uses sparse autoencoders (SAEs) for hypothesis generation and steering, and covariate residualization to reduce the bias that arises because text conflates treatment and covariate information.
Contribution
It presents a novel end-to-end method combining hypothesis generation, text variation, and bias mitigation for causal effect estimation with textual treatments.
Findings
Effective variation induction in target features
Bias reduction through covariate residualization
Robust causal effect estimates achieved
Abstract
Understanding the causal effects of text on downstream outcomes is a central task in many applications. Estimating such effects requires researchers to run controlled experiments that systematically vary textual features. While large language models (LLMs) hold promise for generating text, producing and evaluating controlled variation requires more careful attention. In this paper, we present an end-to-end pipeline for the generation and causal estimation of latent textual interventions. Our work first performs hypothesis generation and steering via sparse autoencoders (SAEs), followed by robust causal estimation. Our pipeline addresses both computational and statistical challenges in text-as-treatment experiments. We demonstrate that naive estimation of causal effects suffers from significant bias as text inherently conflates treatment and covariate information. We describe the…
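The bias described in the abstract, and the effect of covariate residualization, can be illustrated on simulated numeric data. The following is a minimal sketch, not the paper's pipeline: it assumes a linear model and uses ordinary least-squares partialling-out (the Frisch-Waugh-Lovell approach) as a stand-in for the residualization step, with no text or SAEs involved. All function names, coefficients, and the sample size are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=5000, effect=1.0, seed=0):
    """Simulate a treatment feature that is entangled with covariates.

    Hypothetical setup: x stands in for covariate information carried by
    the text, t for the targeted textual feature, y for the outcome.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 3))
    # The treatment feature co-varies with the covariates.
    t = x @ np.array([0.8, -0.5, 0.3]) + rng.normal(size=n)
    # The outcome depends on the treatment and, separately, on the covariates.
    y = effect * t + x @ np.array([1.5, -1.0, 2.0]) + rng.normal(size=n)
    return t, x, y


def ols_slope(t, y):
    """Slope from a simple regression of y on t (with intercept)."""
    t_centered = t - t.mean()
    return float(t_centered @ (y - y.mean()) / (t_centered @ t_centered))


def residualize(v, x):
    """Remove the part of v that is linearly explained by covariates x."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef


t, x, y = simulate()

# Naive estimate: ignores that t and x move together.
naive = ols_slope(t, y)

# Residualized estimate: partial x out of both t and y, then regress.
adjusted = ols_slope(residualize(t, x), residualize(y, x))

print(f"true effect: 1.0, naive: {naive:.3f}, residualized: {adjusted:.3f}")
```

In this simulation the naive slope absorbs the covariates' contribution to the outcome, while the residualized slope lands close to the true effect. With real text, the covariates would have to be extracted from the documents themselves, which is where the paper's pipeline goes beyond this sketch.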
Taxonomy
Topics: Advanced Causal Inference Techniques · Topic Modeling · Computational and Text Analysis Methods
