LLM-Extracted Covariates for Clinical Causal Inference: Rethinking Integration Strategies
Lei Liu, Jialin Chen, Kathy Macropol

TL;DR
This study evaluates how covariates that large language models extract from clinical notes can be integrated into causal inference from electronic health records, and shows that the choice of integration strategy affects estimation accuracy.
Contribution
The paper systematically compares seven covariate-integration strategies, spanning tabular-only baselines, traditional NLP representations, and three LLM-augmented approaches, for estimating the effect of early vasopressor initiation on 28-day mortality in sepsis patients.
Findings
LLM-augmented propensity scores significantly reduce bias in causal estimates.
Direct augmentation of propensity models with LLM covariates outperforms other strategies.
In real data, incorporating LLM-extracted covariates aligns treatment effect estimates with randomized trial results.
Abstract
Causal inference from electronic health records (EHR) is fundamentally limited by unmeasured confounding: critical clinical states such as frailty, goals of care, and mental status are documented in free-text notes but absent from structured data. Large language models can extract these latent confounders as interpretable, structured covariates, yet how to effectively integrate them into causal estimation pipelines has not been systematically studied. Using the MIMIC-IV database with 21,859 sepsis patients, we compare seven covariate-integration strategies for estimating the effect of early vasopressor initiation on 28-day mortality, spanning tabular-only baselines, traditional NLP representations, and three LLM-augmented approaches. A central finding is that not all integration strategies are equally effective: directly augmenting the propensity score model with LLM covariates achieves…
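The idea of augmenting a propensity score model with a note-derived covariate can be illustrated with a small synthetic sketch. Nothing here comes from MIMIC-IV or from the authors' pipeline: the data-generating numbers, the binary `frail` flag standing in for an LLM-extracted covariate, the stratum-frequency propensity estimate, and all function names are assumptions made for illustration. The true risk difference is fixed at 0.05 by construction, so the two inverse-probability-weighted estimates can be compared against it.

```python
import random
from collections import defaultdict

TRUE_EFFECT = 0.05  # risk difference built into the simulation (assumption)

def simulate(n, seed=0):
    """Synthetic cohort: (lab_flag, frail, treated, died). All numbers are made up."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        lab_flag = int(rng.random() < 0.5)   # structured (tabular) covariate
        frail = int(rng.random() < 0.3)      # documented only in notes (hypothetical)
        p_treat = 0.2 + 0.2 * lab_flag + 0.4 * frail
        treated = int(rng.random() < p_treat)
        p_death = 0.15 + 0.05 * lab_flag + 0.30 * frail + TRUE_EFFECT * treated
        died = int(rng.random() < p_death)
        rows.append((lab_flag, frail, treated, died))
    return rows

def stratified_propensity(rows, key):
    """Estimate P(treated | stratum) as the treated fraction within each stratum."""
    treated, total = defaultdict(int), defaultdict(int)
    for row in rows:
        k = key(row)
        total[k] += 1
        treated[k] += row[2]
    return {k: treated[k] / total[k] for k in total}

def ipw_risk_difference(rows, key):
    """Normalized (Hajek) inverse-probability-weighted risk difference."""
    ps = stratified_propensity(rows, key)
    num1 = den1 = num0 = den0 = 0.0
    for row in rows:
        e = ps[key(row)]
        if row[2]:
            w = 1.0 / e
            num1 += w * row[3]
            den1 += w
        else:
            w = 1.0 / (1.0 - e)
            num0 += w * row[3]
            den0 += w
    return num1 / den1 - num0 / den0

rows = simulate(50_000, seed=0)
tabular_only = ipw_risk_difference(rows, key=lambda r: (r[0],))     # ignores frailty
augmented = ipw_risk_difference(rows, key=lambda r: (r[0], r[1]))   # adds frailty flag

print(f"true effect:        {TRUE_EFFECT:.3f}")
print(f"tabular-only IPW:   {tabular_only:.3f}")
print(f"augmented IPW:      {augmented:.3f}")
```

In this toy setting the tabular-only estimate is biased upward because frail patients are both treated more often and more likely to die, while the augmented estimate lands close to the built-in effect. A real analysis would typically fit a regression-based propensity model over many covariates and would have to account for extraction error in the LLM-derived variables, which this sketch ignores.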
