PAIR-CI: Calibrated Conditional Independence Testing for Causal Discovery with Incomplete Data
Thomas S. Robinson, Ranjit Lall

TL;DR
PAIR-CI is a novel nonparametric conditional independence test that improves calibration in causal discovery with incomplete data by integrating multiple imputation and cross-validation, reducing false positives.
Contribution
It introduces PAIR-CI, which the authors describe as, to their knowledge, the first formal unification of multiple imputation and cross-validation for calibrated CI testing in causal discovery.
Findings
PAIR-CI maintains a false positive rate below 5% across various missing data mechanisms.
It reduces structural Hamming distance by up to 44% in large nonlinear graphs.
Existing tests show false positive rates of 28-45% when data are missing not at random (MNAR), a setting in which PAIR-CI performs well.
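For readers unfamiliar with the metric in the second finding, structural Hamming distance (SHD) counts the edge edits needed to turn an estimated graph into the true one. The sketch below is illustrative only: the function name and the convention that a reversed edge costs 1 (some implementations count it as 2) are assumptions, and the summary does not say which convention the paper uses.

```python
import numpy as np

def structural_hamming_distance(a_true, a_est):
    """SHD between two directed graphs given as adjacency matrices.

    Entry [i, j] is truthy when the graph has an edge i -> j.
    Assumed convention: a missing, extra, or reversed edge each costs 1.
    """
    a_true = np.asarray(a_true, dtype=bool)
    a_est = np.asarray(a_est, dtype=bool)
    diff = a_true != a_est        # directed entries on which the graphs disagree
    pair_diff = diff | diff.T     # an unordered pair {i, j} counts at most once
    return int(np.triu(pair_diff, k=1).sum())

# True graph: 0 -> 1 -> 2. Estimate: 1 -> 0 (reversed), 1 -> 2, plus an extra 0 -> 2.
a_true = [[0, 1, 0],
          [0, 0, 1],
          [0, 0, 0]]
a_est = [[0, 0, 1],
         [1, 0, 1],
         [0, 0, 0]]
shd = structural_hamming_distance(a_true, a_est)  # one reversal + one extra edge
```

A lower SHD means the recovered graph is closer to the true one, so a reduction of up to 44% is an improvement in graph recovery.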
Abstract
The standard constraint-based paradigm for causal discovery with incomplete data -- impute first, test second -- is frequently miscalibrated: any consistent conditional independence (CI) test rejects a true null with probability approaching 1 when imputation error induces spurious conditional dependence. We introduce PAIR-CI, a nonparametric CI test that restores calibration by integrating multiple imputation directly into the inferential procedure via a paired permutation design. PAIR-CI compares cross-validated models that include and exclude the candidate variable while receiving the same imputed conditioning set, forcing imputation error to cancel in their loss difference rather than contaminate the test statistic. A provably consistent variance estimator jointly accounts for uncertainty arising from cross-validation and multiple imputation -- to our knowledge, the first formal…
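The comparison described in the abstract can be sketched loosely as follows: for each imputed copy of the conditioning set, fit one model without the candidate variable and one with it, score both on the same held-out folds, and average the loss difference over imputations. This is not the authors' procedure. The linear models, the hot-deck imputation, and all function names are stand-ins chosen for brevity, and the paired permutation step and the variance estimator are not reproduced, so the output is a raw statistic rather than a calibrated test.

```python
import numpy as np

def hot_deck_imputations(Z, n_imputations=5, seed=0):
    """Fill NaNs with random draws from observed values in the same column.

    Stand-in for a proper multiple imputation method (an assumption).
    """
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_imputations):
        Z_imp = Z.copy()
        for j in range(Z.shape[1]):
            missing = np.isnan(Z[:, j])
            Z_imp[missing, j] = rng.choice(Z[~missing, j], size=missing.sum())
        draws.append(Z_imp)
    return draws

def cv_loss_difference(x, y, Z, n_folds=5, seed=0):
    """Held-out squared error of the reduced model minus the full model."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    ones = np.ones((n, 1))
    reduced = np.hstack([ones, Z])            # conditioning set only
    full = np.hstack([ones, Z, x[:, None]])   # conditioning set plus candidate x
    diff = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        b_red, *_ = np.linalg.lstsq(reduced[train], y[train], rcond=None)
        b_full, *_ = np.linalg.lstsq(full[train], y[train], rcond=None)
        loss_red = (y[test_idx] - reduced[test_idx] @ b_red) ** 2
        loss_full = (y[test_idx] - full[test_idx] @ b_full) ** 2
        diff[test_idx] = loss_red - loss_full  # both models saw the same imputed Z
    return diff

def pooled_loss_difference(x, y, Z_imputations):
    """Average the per-observation loss difference over imputed datasets."""
    # The default seed fixes the fold split, so every imputation uses the same folds.
    diffs = np.stack([cv_loss_difference(x, y, Z_m) for Z_m in Z_imputations])
    return diffs.mean(axis=0)

# Synthetic data. x is drawn independently of z here, so this toy setup does not
# reproduce the harder case in which imputation error induces spurious dependence.
rng = np.random.default_rng(1)
n = 400
z = rng.normal(size=n)
x = rng.normal(size=n)
y_dep = z + x + rng.normal(size=n)   # x matters given z
y_ind = z + rng.normal(size=n)       # x is irrelevant given z
Z = z[:, None].copy()
Z[rng.random(n) < 0.2, 0] = np.nan   # 20% of Z missing completely at random
imputations = hot_deck_imputations(Z)

gain_dep = pooled_loss_difference(x, y_dep, imputations).mean()
gain_ind = pooled_loss_difference(x, y_ind, imputations).mean()
```

In this toy setup `gain_dep` should be clearly positive and `gain_ind` close to zero. Turning such a difference into a p-value with correct size under imputation error is the part the paper addresses and this sketch leaves out.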
