Causal Estimation for Text Data with (Apparent) Overlap Violations

Lin Gui; Victor Veitch

arXiv:2210.00079 · stat.ML · February 9, 2023 · 6 cites

TL;DR

This paper introduces a method for estimating the causal effect of a text attribute when overlap is apparently violated. It learns a representation of the text that retains confounding information but discards information that predicts only the treatment, so that adjusting for the representation supports robust causal inference.
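As a rough illustration of why the choice of adjustment variables matters, the toy sketch below computes empirical treatment probabilities within strata. It is not the paper's procedure: the records, the field layout, and the helper names `propensity_by_stratum` and `overlap_holds` are all hypothetical, and the "representation" here is simply a hand-picked confounder (topic) rather than anything learned.

```python
from collections import defaultdict

# Toy records: (topic, contains_polite_phrase, treated).
# "treated" (polite vs. rude) is read directly off the text, so it always
# equals contains_polite_phrase. All values are made up for illustration.
records = [
    ("billing", 1, 1), ("billing", 0, 0), ("billing", 1, 1),
    ("support", 0, 0), ("support", 1, 1), ("support", 0, 0),
]


def propensity_by_stratum(rows, key):
    """Empirical P(treated = 1 | key(row)) for every observed stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_treated, n_total]
    for row in rows:
        stratum = key(row)
        counts[stratum][0] += row[2]
        counts[stratum][1] += 1
    return {s: treated / total for s, (treated, total) in counts.items()}


def overlap_holds(propensities):
    """True if every stratum has a treatment probability strictly in (0, 1)."""
    return all(0.0 < p < 1.0 for p in propensities.values())


# Adjusting for the whole "text" (topic plus the treatment-only feature):
full_text = propensity_by_stratum(records, key=lambda r: (r[0], r[1]))
# Adjusting for a representation that keeps only the confounder (topic):
topic_only = propensity_by_stratum(records, key=lambda r: r[0])

print("full text :", full_text, "overlap:", overlap_holds(full_text))
print("topic only:", topic_only, "overlap:", overlap_holds(topic_only))
```

In this toy setup, conditioning on every feature of the text pins the treatment down exactly, while conditioning on the confounder alone leaves both treatment levels possible in each stratum.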

Contribution

It proposes a supervised representation learning approach to address overlap violations in causal estimation with text data, ensuring valid adjustment and uncertainty quantification.

Findings

01 Significant bias reduction compared to baseline methods

02 Improved uncertainty quantification in causal estimates

03 Robustness to outcome misestimation

Abstract

Consider the problem of estimating the causal effect of some attribute of a text document; for example: what effect does writing a polite vs. rude email have on response time? To estimate a causal effect from observational data, we need to adjust for confounding aspects of the text that affect both the treatment and outcome -- e.g., the topic or writing level of the text. These confounding aspects are unknown a priori, so it seems natural to adjust for the entirety of the text (e.g., using a transformer). However, causal identification and estimation procedures rely on the assumption of overlap: for all levels of the adjustment variables, there is randomness leftover so that every unit could have (not) received treatment. Since the treatment here is itself an attribute of the text, it is perfectly determined, and overlap is apparently violated. The purpose of this paper is to show how…
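The overlap assumption described in the abstract can be written out as follows. The notation is an assumption made here for illustration and does not appear on this page: A is the binary treatment (e.g., polite vs. rude), X is the full text used for adjustment, and f is the function that reads the treatment off the text.

```latex
% Overlap (positivity) for a binary treatment A and adjustment variables X:
\[
  0 < \Pr(A = 1 \mid X = x) < 1
  \quad \text{for all } x \text{ in the support of } X .
\]
% If the treatment is a deterministic function of the text, A = f(X), then
\[
  \Pr(A = 1 \mid X = x) = f(x) \in \{0, 1\},
\]
% so the condition fails at every x when adjusting for the entire text.
```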

Videos

Causal Estimation for Text Data with (Apparent) Overlap Violations · SlidesLive

Taxonomy

Topics: Advanced Causal Inference Techniques · Bayesian Modeling and Causal Inference