CausalNLP: A Practical Toolkit for Causal Inference with Text
Arun S. Maiya

TL;DR
CausalNLP is an open-source toolkit for causal inference on observational data that includes text. Raw text and its linguistic properties can serve as treatments, outcomes, or confounders, which extends causal analysis beyond numerical and categorical variables.
Contribution
It introduces a practical toolkit that incorporates raw text and linguistic features into causal inference models, extending traditional methods to handle textual data.
Findings
Supports treatment effect estimation with text data
Allows treating linguistic properties as variables
Open source implementation available
Abstract
Causal inference is the process of estimating the effect or impact of a treatment on an outcome with other covariates as potential confounders (and mediators) that may need to be controlled. The vast majority of existing methods and systems for causal inference assume that all variables under consideration are categorical or numerical (e.g., gender, price, enrollment). In this paper, we present CausalNLP, a toolkit for inferring causality with observational data that includes text in addition to traditional numerical and categorical variables. CausalNLP uses meta-learners for treatment effect estimation and supports using raw text and its linguistic properties as a treatment, an outcome, or a "controlled-for" variable (e.g., confounder). The library is open source and available at: https://github.com/amaiya/causalnlp.
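To make the meta-learner idea concrete, here is a minimal sketch of a T-learner in plain Python. It is not the CausalNLP API: the function names, the keyword-based text feature, and the toy records are all hypothetical, chosen only to illustrate how a linguistic property of text can be controlled for as a confounder. The base learner is deliberately trivial (a per-group mean), and the sketch assumes every feature value appears in both the treated and the control group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: review text, a binary treatment, and a numeric outcome.
RECORDS = [
    {"text": "great product", "treated": 1, "outcome": 5.0},
    {"text": "great value", "treated": 1, "outcome": 5.0},
    {"text": "great service", "treated": 0, "outcome": 4.0},
    {"text": "poor quality", "treated": 1, "outcome": 3.0},
    {"text": "poor fit", "treated": 0, "outcome": 1.0},
    {"text": "poor packaging", "treated": 0, "outcome": 1.0},
]


def text_feature(text):
    """Hypothetical linguistic property: 1 if the text contains 'great', else 0."""
    return int("great" in text.lower().split())


def fit_group_means(rows):
    """Trivial base learner: mean outcome for each feature value."""
    buckets = defaultdict(list)
    for x, y in rows:
        buckets[x].append(y)
    return {x: mean(ys) for x, ys in buckets.items()}


def t_learner_ate(records):
    """T-learner: fit one outcome model per treatment arm, then difference them."""
    feats = [(text_feature(r["text"]), r["treated"], r["outcome"]) for r in records]
    mu1 = fit_group_means([(x, y) for x, t, y in feats if t == 1])  # treated model
    mu0 = fit_group_means([(x, y) for x, t, y in feats if t == 0])  # control model
    # Conditional effect per feature value (requires overlap between arms).
    cate = {x: mu1[x] - mu0[x] for x in mu1.keys() & mu0.keys()}
    # Average the conditional effects over all units to estimate the ATE.
    ate = mean(cate[x] for x, _, _ in feats)
    return cate, ate


if __name__ == "__main__":
    cate, ate = t_learner_ate(RECORDS)
    print("Conditional effects by text feature:", cate)
    print("Estimated average treatment effect:", ate)
```

In this toy data the treated group contains more "great" texts, so a naive difference in mean outcomes between treated and control units would overstate the effect. Conditioning on the text-derived feature removes that imbalance. A practical toolkit would replace the keyword feature and group means with richer text representations and stronger base learners.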
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Topic Modeling · Advanced Text Analysis Techniques
