A Unified Framework for Inference with General Missingness Patterns and Machine Learning Imputation
Xingran Chen, Tyler McCormick, Bhramar Mukherjee, Zhenke Wu

TL;DR
This paper introduces a statistical inference framework that uses machine learning imputations to handle general missingness patterns (missing outcomes and exposures) under the more realistic missing-at-random (MAR) assumption, delivering valid inference with efficiency gains over complete-case analysis.
Contribution
It develops a novel method for valid inference with ML imputations across general missingness patterns under MAR, with theoretical guarantees and practical implementation strategies.
Findings
The proposed estimator is asymptotically normal.
It demonstrates efficiency gains over complete-case analysis.
The method is validated through extensive simulations and real data examples.
Abstract
Pre-trained machine learning (ML) predictions have been increasingly used to complement incomplete data to enable downstream scientific inquiries, but their naive integration risks biased inferences. Recently, multiple methods have been developed to provide valid inference with ML imputations regardless of prediction quality and to enhance efficiency relative to complete-case analyses. However, existing approaches are often limited to missing outcomes under a missing-completely-at-random (MCAR) assumption, failing to handle general missingness patterns (missing in both the outcome and exposures) under the more realistic missing-at-random (MAR) assumption. This paper develops a novel method that delivers a valid statistical inference framework for general Z-estimation problems using ML imputations under the MAR assumption and for general missingness patterns. The core technical idea is…
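The bias risk described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's estimator (its core idea is truncated above); it only contrasts three textbook estimators of a mean when the outcome is missing at random: naive plug-in of ML predictions, complete-case analysis, and a standard inverse-probability-weighted residual correction. The data-generating model, the deliberately biased predictor `f`, the function names, and the assumption that the observation probability `pi` is known are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Toy data (hypothetical): outcome Y is missing at random given covariate X."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)   # true mean of Y is 1.0
    f = 2.0 * x                               # "pre-trained" prediction with a biased intercept
    pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))     # P(Y observed | X); assumed known here
    r = rng.random(n) < pi                    # True where Y is observed
    return y, f, pi, r

def estimate_means(y, f, pi, r):
    """Three estimators of E[Y]; only entries of y with r == True are treated as observed."""
    # Naive: use the prediction wherever Y is missing, then average.
    naive = np.where(r, y, f).mean()
    # Complete-case: average the observed outcomes only.
    complete_case = y[r].mean()
    # Debiased: average prediction plus an inverse-probability-weighted
    # average of the residuals on the observed rows.
    debiased = f.mean() + np.mean(r / pi * (y - f))
    return naive, complete_case, debiased

y, f, pi, r = simulate()
naive, complete_case, debiased = estimate_means(y, f, pi, r)
print(f"naive={naive:.3f}  complete-case={complete_case:.3f}  debiased={debiased:.3f}")
```

In this toy setup the naive and complete-case estimates drift away from the true mean of 1.0, while the residual-corrected estimate stays close to it even though the predictor is poor. In practice the observation probability would have to be estimated rather than assumed known.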
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Advanced Causal Inference Techniques · Statistical Methods and Inference
