Challenges of Using Text Classifiers for Causal Inference
Zach Wood-Doughty, Ilya Shpitser, and Mark Dredze

TL;DR
This paper explores the potential and challenges of using text classifiers for causal inference, extending causal analysis methods to language data and demonstrating their application on simulated and real datasets.
Contribution
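It examines how text classifiers can be used within causal inference, drawing on established modeling mechanisms for missing data and measurement error from the causality literature, and so addresses a gap in existing methods for language data.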
Findings
Successful causal analysis on simulated data
Application of methods to Yelp review data
Discussion of opportunities and challenges for future research
Abstract
Causal understanding is essential for many kinds of decision-making, but causal inference from observational data has typically only been applied to structured, low-dimensional datasets. While text classifiers produce low-dimensional outputs, their use in causal inference has not previously been studied. To facilitate causal analyses based on language data, we consider the role that text classifiers can play in causal inference through established modeling mechanisms from the causality literature on missing data and measurement error. We demonstrate how to conduct causal analyses using text classifiers on simulated and Yelp data, and discuss the opportunities and challenges of future work that uses text data in causal inference.
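To make the measurement-error concern in the abstract concrete, the following is a minimal sketch on toy simulated data. It is not the authors' setup or code. The data-generating process, the 20% error rate, and the names `simulate` and `backdoor_ate` are all assumptions chosen for illustration. A noisy copy of a binary confounder stands in for a text classifier's prediction, and the sketch compares a backdoor-adjusted effect estimate using the true confounder against one using the noisy proxy.

```python
import numpy as np

def simulate(n=200_000, seed=0, error_rate=0.2):
    """Toy data (assumed, not from the paper): C confounds A and Y."""
    rng = np.random.default_rng(seed)
    c = rng.binomial(1, 0.5, n)                    # true binary confounder
    a = rng.binomial(1, 0.2 + 0.6 * c)             # treatment depends on C
    y = 1.0 * a + 2.0 * c + rng.normal(0, 1, n)    # true effect of A on Y is 1.0
    flip = rng.binomial(1, error_rate, n)          # which predictions are wrong
    c_star = np.where(flip == 1, 1 - c, c)         # stand-in for a classifier's output
    return c, c_star, a, y

def backdoor_ate(a, y, z):
    """Backdoor adjustment over a binary covariate z."""
    ate = 0.0
    for v in (0, 1):
        stratum = z == v
        diff = y[stratum & (a == 1)].mean() - y[stratum & (a == 0)].mean()
        ate += diff * stratum.mean()               # weight by P(z = v)
    return ate

c, c_star, a, y = simulate()
ate_true = backdoor_ate(a, y, c)                   # adjusts for the true confounder
ate_proxy = backdoor_ate(a, y, c_star)             # adjusts for the noisy proxy
ate_naive = y[a == 1].mean() - y[a == 0].mean()    # no adjustment at all

print(f"true confounder: {ate_true:.2f}")
print(f"noisy proxy:     {ate_proxy:.2f}")
print(f"unadjusted:      {ate_naive:.2f}")
```

In this toy setting, adjusting for the noisy proxy removes only part of the confounding bias, so its estimate falls between the correctly adjusted and the unadjusted ones. That gap is why a classifier's errors need to be modeled explicitly rather than treating its predictions as ground truth.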
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI)
Methods: Causal inference
