Recursive Equations For Imputation Of Missing Not At Random Data With Sparse Pattern Support
Trung Phung, Kyle Reese, Ilya Shpitser, Rohit Bhattacharya

TL;DR
This paper introduces MISPR, a new imputation method that handles missing not at random (MNAR) data even when some missingness patterns have no support in the data, reducing bias and broadening applicability relative to traditional methods.
Contribution
The paper develops a novel characterization for the full data law in graphical models, enabling a new imputation algorithm that works under both MAR and MNAR without additional assumptions.
Findings
MISPR performs comparably to MICE under MAR.
MISPR yields less biased results under MNAR.
The method handles unsupported missing data patterns effectively.
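The MAR versus MNAR contrast behind these findings can be illustrated with a small simulation. The sketch below is not MISPR and not MICE: it uses a single regression imputation as a simple stand-in for a MAR-based imputer, and the function names, the logistic missingness mechanisms, and the sample size are all assumptions made for illustration. It estimates the mean of a variable `y` (true mean 0) after imputing its missing values from an always-observed covariate `x`.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def mean_after_regression_imputation(n=20000, seed=0):
    """Hypothetical demo: estimate E[y] after MAR-style regression imputation.

    Returns the estimate when y is truly MAR (missingness depends on x only)
    and when y is MNAR (missingness depends on y itself). The true mean is 0.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]

    # Observation indicators: True means y is observed.
    observed = {
        "MAR": [rng.random() > sigmoid(2 * xi) for xi in x],   # depends on x
        "MNAR": [rng.random() > sigmoid(2 * yi) for yi in y],  # depends on y
    }

    estimates = {}
    for name, obs in observed.items():
        xs_obs = [xi for xi, o in zip(x, obs) if o]
        ys_obs = [yi for yi, o in zip(y, obs) if o]
        intercept, slope = fit_line(xs_obs, ys_obs)
        # Keep observed values; fill missing ones with the fitted prediction.
        completed = [
            yi if o else intercept + slope * xi
            for xi, yi, o in zip(x, y, obs)
        ]
        estimates[name] = sum(completed) / n
    return estimates


if __name__ == "__main__":
    for mechanism, estimate in mean_after_regression_imputation().items():
        print(f"{mechanism}: estimated mean of y = {estimate:.3f} (truth: 0)")
```

Under the MAR mechanism the imputed mean should land near the truth, while under the MNAR mechanism the same procedure is pulled downward because large values of `y` are the ones that go missing. This is the kind of bias the findings attribute to MAR-based imputation.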
Abstract
A common approach for handling missing values in data analysis pipelines is multiple imputation via software packages such as MICE (Van Buuren and Groothuis-Oudshoorn, 2011) and Amelia (Honaker et al., 2011). These packages typically assume the data are missing at random (MAR), and impose parametric or smoothing assumptions upon the imputing distributions in a way that allows imputation to proceed even if not all missingness patterns have support in the data. Such assumptions are unrealistic in practice, and induce model misspecification bias on any analysis performed after such imputation. In this paper, we provide a principled alternative. Specifically, we develop a new characterization for the full data law in graphical models of missing data. This characterization is constructive, is easily adapted for the calculation of imputation distributions for both MAR and MNAR (missing not…
Taxonomy
Topics: Neural Networks and Applications
