Multiple imputation and test-wise deletion for causal discovery with incomplete cohort data
Janine Witte, Ronja Foraita, Vanessa Didelez

TL;DR
This paper compares test-wise deletion and multiple imputation for causal discovery with incomplete data, showing their relative strengths and limitations through simulations and a real-world application.
Contribution
It establishes necessary and sufficient conditions for recovering causal structure under test-wise deletion and evaluates the effectiveness of multiple imputation in various data settings.
Findings
Both methods outperform list-wise deletion and single imputation.
Multiple imputation is especially effective for small datasets whose variables are homogeneous, that is, all of one type.
Neither method is uniformly best when datasets contain a mix of Gaussian and discrete variables.
Abstract
Causal discovery algorithms estimate causal graphs from observational data. This can provide a valuable complement to analyses focussing on the causal relation between individual treatment-outcome pairs. Constraint-based causal discovery algorithms rely on conditional independence testing when building the graph. Until recently, these algorithms have been unable to handle missing values. In this paper, we investigate two alternative solutions: Test-wise deletion and multiple imputation. We establish necessary and sufficient conditions for the recoverability of causal structures under test-wise deletion, and argue that multiple imputation is more challenging in the context of causal discovery than for estimation. We conduct an extensive comparison by simulating from benchmark causal graphs: As one might expect, we find that test-wise deletion and multiple imputation both clearly…
Taxonomy
Topics: Advanced Causal Inference Techniques · Bayesian Modeling and Causal Inference · Statistical Methods and Bayesian Inference
