End-To-End Causal Effect Estimation from Unstructured Natural Language Data
Nikita Dhawan, Leonardo Cotta, Karen Ullrich, Rahul G. Krishnan, Chris J. Maddison

TL;DR
This paper introduces NATURAL, a novel method leveraging large language models to estimate causal effects directly from unstructured text data, reducing reliance on manual data curation and enabling cost-effective, automated causal inference.
Contribution
The paper presents NATURAL, a new family of causal estimators that operate on unstructured text using LLMs, automating data curation and imputation for causal effect estimation.
Findings
NATURAL's causal effect estimates fall within 3 percentage points of the corresponding ground-truth values.
The estimators are evaluated on both synthetic and real-world datasets.
The estimates remain accurate on real-world clinical trial data.
Abstract
Knowing the effect of an intervention is critical for human decision-making, but current approaches for causal effect estimation rely on manual data collection and structuring, regardless of the causal assumptions. This increases both the cost and time-to-completion for studies. We show how large, diverse observational text data can be mined with large language models (LLMs) to produce inexpensive causal effect estimates under appropriate causal assumptions. We introduce NATURAL, a novel family of causal effect estimators built with LLMs that operate over datasets of unstructured text. Our estimators use LLM conditional distributions (over variables of interest, given the text data) to assist in the computation of classical estimators of causal effect. We overcome a number of technical challenges to realize this idea, such as automating data curation and using LLMs to impute missing…
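To make the idea of "assisting a classical estimator" concrete, here is a minimal Python sketch of one classical estimator, inverse propensity weighting (IPW), applied to records that are assumed to have already been extracted from text. It is not the paper's implementation. The function name `ipw_ate`, the record layout `(x, t, y, p)`, and the reading of `p` as a probability an LLM assigns to that record given a report are all illustrative assumptions, and the LLM step itself is not shown.

```python
from collections import defaultdict


def ipw_ate(records):
    """Self-normalized IPW estimate of the average treatment effect.

    records: iterable of (x, t, y, p) tuples, where
      x is a discrete covariate stratum (any hashable value),
      t is the treatment indicator (0 or 1),
      y is a numeric outcome,
      p is a non-negative weight for the record (hypothetical: the
        probability an LLM assigns to this (x, t, y) given a report).
    """
    # Drop zero-weight records so every stratum/arm we divide by has mass.
    records = [r for r in records if r[3] > 0]

    # Weighted mass per stratum and arm: x -> [control mass, treated mass].
    mass = defaultdict(lambda: [0.0, 0.0])
    for x, t, _, p in records:
        mass[x][t] += p

    # Per arm: [weighted outcome sum, weight sum].
    totals = {0: [0.0, 0.0], 1: [0.0, 0.0]}
    for x, t, y, p in records:
        m0, m1 = mass[x]
        e = m1 / (m0 + m1)  # propensity estimate P(T=1 | X=x)
        w = p / (e if t == 1 else 1.0 - e)
        totals[t][0] += w * y
        totals[t][1] += w

    if totals[0][1] == 0 or totals[1][1] == 0:
        raise ValueError("need at least one treated and one control record")

    # Difference of weighted mean outcomes between the two arms.
    return totals[1][0] / totals[1][1] - totals[0][0] / totals[0][1]


if __name__ == "__main__":
    # Toy records with unit weights; stratum "b" is treated less often.
    toy = [
        ("a", 1, 1, 1.0), ("a", 1, 1, 1.0), ("a", 0, 0, 1.0), ("a", 0, 1, 1.0),
        ("b", 1, 1, 1.0), ("b", 0, 0, 1.0), ("b", 0, 0, 1.0), ("b", 0, 0, 1.0),
    ]
    print(ipw_ate(toy))
```

The sketch uses the self-normalized form so that the weights in each arm sum to one. Like any IPW estimator, it is only meaningful under the usual causal assumptions, such as no unmeasured confounding given `x` and overlap between the treated and control groups.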
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Bayesian Modeling and Causal Inference
