Diagnosing missing always at random in multivariate data
Iavor Bojinov, Natesh Pillai, Donald Rubin

TL;DR
This paper introduces diagnostic tests to assess the validity of the missing always at random assumption in multivariate data analysis, aiding researchers in identifying potential violations and guiding sensitivity analyses.
Contribution
It proposes three novel diagnostic tests for detecting violations of the missing always at random assumption in multivariate data with missing values.
Findings
The tests can identify when the missing always at random assumption is violated.
They help pinpoint which variables are likely causing the violation.
The approach facilitates targeted sensitivity analyses.
Abstract
Models for analyzing multivariate data sets with missing values require strong, often unassessable, assumptions. The most common of these is that the mechanism that created the missing data is ignorable, a twofold assumption that depends on the mode of inference. The first part, which is the focus here, under the Bayesian and direct-likelihood paradigms, requires that the missing data are missing at random; in contrast, the frequentist-likelihood paradigm demands that the missing data mechanism always produces missing at random data, a condition known as missing always at random. Under certain regularity conditions, assuming missing always at random leads to an assumption that can be tested using the observed data alone: namely, that the missing data indicators depend only on fully observed variables. Here, we propose three different diagnostic tests that not only indicate when this assumption…
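To make the testable condition in the abstract concrete, the following is a minimal sketch in Python of one way to probe whether a missingness indicator depends on anything beyond a fully observed variable. It is not one of the paper's three tests, which the truncated abstract does not describe. The function name `stratified_permutation_pvalue`, the simulated variables, and the choice of a stratified permutation test are all illustrative assumptions. The sketch also assumes the fully observed variable is categorical, and it uses only the rows where the second variable is observed, which is itself a simplification.

```python
import numpy as np

def stratified_permutation_pvalue(r, y, x, n_perm=1000, seed=0):
    """Hypothetical helper: p-value for association between a missingness
    indicator r (1 = missing) and a partially observed variable y, permuting
    r within strata of a fully observed categorical variable x."""
    rng = np.random.default_rng(seed)
    keep = ~np.isnan(y)                  # only rows where y is observed are usable
    r, y, x = r[keep], y[keep], x[keep]
    levels = np.unique(x)

    def statistic(r_vec):
        # Size-weighted sum over strata of |mean(y | r=1) - mean(y | r=0)|.
        total = 0.0
        for level in levels:
            m = x == level
            r_s, y_s = r_vec[m], y[m]
            if r_s.min() == r_s.max():   # stratum contains a single group; skip it
                continue
            total += m.sum() * abs(y_s[r_s == 1].mean() - y_s[r_s == 0].mean())
        return total

    observed = statistic(r)
    exceed = 0
    for _ in range(n_perm):
        r_perm = r.copy()
        for level in levels:
            m = x == level
            r_perm[m] = rng.permutation(r[m])   # shuffle indicators within the stratum
        exceed += int(statistic(r_perm) >= observed)
    return (exceed + 1) / (n_perm + 1)

# Simulated illustration (all values are made up for this sketch).
rng = np.random.default_rng(1)
n = 2000
x = rng.integers(0, 2, size=n)                  # fully observed binary variable
y2 = rng.normal(loc=x, scale=1.0, size=n)       # second variable, later partially masked

# Case A: missingness of a first variable depends only on x.
r_ok = (rng.random(n) < np.where(x == 1, 0.4, 0.2)).astype(int)
# Case B: missingness also depends on y2, which violates the condition.
r_bad = (rng.random(n) < 1.0 / (1.0 + np.exp(-(y2 - 0.5)))).astype(int)

# Make y2 itself partially observed, completely at random.
y2_obs = y2.copy()
y2_obs[rng.random(n) < 0.1] = np.nan

p_ok = stratified_permutation_pvalue(r_ok, y2_obs, x)
p_bad = stratified_permutation_pvalue(r_bad, y2_obs, x)
print(f"depends on x only: p = {p_ok:.3f}")
print(f"depends on y2 too: p = {p_bad:.4f}")
```

In this simulation, a small p-value in the second case flags that the indicator is associated with `y2` even within levels of `x`. Permuting within strata keeps the dependence of missingness on the fully observed variable intact, so only the additional dependence is tested.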
