Causal Inference with High-dimensional Discrete Covariates
Zhenghao Zeng, Sivaraman Balakrishnan, Yanjun Han, Edward H. Kennedy

TL;DR
This paper analyzes the challenges of estimating causal effects with high-dimensional discrete covariates, establishes error bounds for commonly used estimators, and proposes improved methods under additional structural assumptions.
Contribution
It provides theoretical bounds for common estimators, derives minimax lower bounds, and introduces new estimators that leverage structure for faster convergence.
Findings
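Mean squared error of commonly used regression, weighting and doubly robust estimators is bounded by d^2/n^2 + 1/n, where d is the number of covariate categories and n is the sample size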
Minimax lower bound of order d^2/(n^2 log^2 n) + 1/n
Proposed estimators achieve faster convergence under additional assumptions
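
To get a feel for the two rates listed above, the following sketch evaluates them numerically. It is only an illustration: constants are omitted, the natural logarithm is assumed, and the function names upper_bound_rate and lower_bound_rate are hypothetical, not taken from the paper.

```python
import math


def upper_bound_rate(d: int, n: int) -> float:
    """Order of the MSE upper bound, d^2/n^2 + 1/n (constants omitted)."""
    return d**2 / n**2 + 1 / n


def lower_bound_rate(d: int, n: int) -> float:
    """Order of the minimax lower bound, d^2/(n^2 log^2 n) + 1/n.

    Constants are omitted and the natural log is an assumption.
    """
    return d**2 / (n**2 * math.log(n) ** 2) + 1 / n


if __name__ == "__main__":
    n = 10_000
    for d in (10, 100, 1_000, 10_000):
        up = upper_bound_rate(d, n)
        lo = lower_bound_rate(d, n)
        print(f"d={d:>6}  upper~{up:.3e}  lower~{lo:.3e}  ratio={up / lo:.2f}")
```

Under these formulas, the 1/n term dominates both rates when d is at most about sqrt(n); for larger d the two rates differ by at most a log^2 n factor.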
Abstract
When estimating causal effects from observational studies, researchers often need to adjust for many covariates to deconfound the non-causal relationship between exposure and outcome, among which many covariates are discrete. The behavior of commonly used estimators in the presence of many discrete covariates is not well understood since their properties are often analyzed under structural assumptions including sparsity and smoothness, which do not apply in discrete settings. In this work, we study the estimation of causal effects in a model where the covariates required for confounding adjustment are discrete but high-dimensional, meaning the number of categories d is comparable with or even larger than sample size n. Specifically, we show the mean squared error of commonly used regression, weighting and doubly robust estimators is bounded by d^2/n^2 + 1/n. We then…
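
As a rough illustration of the kind of regression (plug-in) estimator the abstract refers to, the sketch below averages within-cell treated-minus-control differences over the categories of a single discrete covariate. It is a minimal sketch under stated assumptions, not the paper's estimator: the names plugin_ate and simulate, the data-generating process, and the choice to skip cells that lack either treated or control units are all hypothetical.

```python
import random
from collections import defaultdict


def plugin_ate(x, a, y):
    """Plug-in estimate of the average treatment effect for one discrete
    covariate x, binary treatment a (0/1) and outcome y.

    Cells with no treated or no control units are skipped and the
    remaining cell weights renormalised. This handling of empty cells is
    an illustrative assumption, not something taken from the paper.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # cell -> [sum of y for a=0, a=1]
    counts = defaultdict(lambda: [0, 0])    # cell -> [count for a=0, a=1]
    for xi, ai, yi in zip(x, a, y):
        sums[xi][ai] += yi
        counts[xi][ai] += 1

    estimate, total_weight = 0.0, 0
    for cell, (n0, n1) in counts.items():
        if n0 == 0 or n1 == 0:
            continue  # effect in this cell cannot be estimated from its own data
        diff = sums[cell][1] / n1 - sums[cell][0] / n0
        estimate += (n0 + n1) * diff
        total_weight += n0 + n1
    if total_weight == 0:
        raise ValueError("no cell has both treated and control units")
    return estimate / total_weight


def simulate(n, d, seed=0):
    """Hypothetical confounded data with d categories and true effect 1.0."""
    rng = random.Random(seed)
    x = [rng.randrange(d) for _ in range(n)]
    # Treatment probability and baseline outcome both depend on x.
    a = [1 if rng.random() < 0.3 + 0.4 * (xi % 2) else 0 for xi in x]
    y = [1.0 * ai + (xi % 3) + rng.gauss(0.0, 1.0) for xi, ai in zip(x, a)]
    return x, a, y


if __name__ == "__main__":
    n = 2_000
    for d in (10, 200, 2_000):
        print(d, round(plugin_ate(*simulate(n, d)), 3))
```

When d is comparable with n, many cells contain only a handful of observations or are missing one treatment arm entirely, which is the regime the abstract describes as poorly understood.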
Taxonomy
Topics: Bayesian Modeling and Causal Inference
