Deep Generative Pattern-Set Mixture Models for Nonignorable Missingness
Sahra Ghalebikesabi, Rob Cornish, Luke J. Kelly, Chris Holmes

TL;DR
This paper introduces a variational autoencoder-based model that handles both ignorable and nonignorable missing data by clustering missingness patterns. It achieves state-of-the-art imputation results, especially when much of the data is missing and the missingness mechanism is nonignorable.
Contribution
The paper presents a novel deep generative model that explicitly learns missingness pattern sets and models data under nonignorable missingness, improving imputation performance.
Findings
Achieves state-of-the-art imputation accuracy across various datasets.
Performs particularly well with high levels of nonignorable missing data.
Outperforms many common imputation algorithms, especially when the amount of missing data is high.
Abstract
We propose a variational autoencoder architecture to model both ignorable and nonignorable missing data using pattern-set mixtures as proposed by Little (1993). Our model explicitly learns to cluster the missing data into missingness pattern sets based on the observed data and missingness masks. Underpinning our approach is the assumption that the data distribution under missingness is probabilistically semi-supervised by samples from the observed data distribution. Our setup trades off the characteristics of ignorable and nonignorable missingness and can thus be applied to data of both types. We evaluate our method on a wide range of data sets with different types of missingness and achieve state-of-the-art imputation performance. Our model outperforms many common imputation algorithms, especially when the amount of missing data is high and the missingness mechanism is nonignorable.
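The distinction between ignorable and nonignorable missingness, and the idea of grouping rows by missingness pattern, can be illustrated with a small simulation. This is a minimal sketch and not the authors' model: the helper names `make_masks` and `group_by_pattern`, the logistic self-masking rule, and all parameter values are assumptions made for illustration. It groups rows by their exact mask, whereas the paper's model learns the pattern sets from the observed data and the masks.

```python
import numpy as np

def make_masks(x, rate=0.3, seed=0):
    """Return (mcar, mnar) boolean masks; True marks a missing entry.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Ignorable case (MCAR): missingness is independent of the values.
    mcar = rng.random(x.shape) < rate
    # Nonignorable case (self-masking): larger values are more likely missing.
    prob_missing = 1.0 / (1.0 + np.exp(-3.0 * x))
    mnar = rng.random(x.shape) < prob_missing
    return mcar, mnar

def group_by_pattern(mask):
    """Map each distinct row-wise missingness pattern to its row indices."""
    patterns, inverse = np.unique(
        mask.astype(np.uint8), axis=0, return_inverse=True
    )
    inverse = np.ravel(inverse)  # shape differs across NumPy versions
    return {
        tuple(bool(v) for v in p): np.flatnonzero(inverse == k)
        for k, p in enumerate(patterns)
    }

rng = np.random.default_rng(1)
x = rng.standard_normal((5000, 3))  # true mean of every column is 0
mcar, mnar = make_masks(x)

# Mean of the observed entries only.
mean_mcar = x[~mcar].mean()  # stays close to the true mean
mean_mnar = x[~mnar].mean()  # biased downward: large values were dropped

groups = group_by_pattern(mnar)  # at most 2**3 = 8 patterns for 3 columns
print(f"observed mean under MCAR: {mean_mcar:.3f}")
print(f"observed mean under MNAR: {mean_mnar:.3f}")
print(f"distinct missingness patterns: {len(groups)}")
```

Under the random mask the observed mean stays near the true value, while under the self-masking rule it is biased, which is why methods that assume ignorability can impute poorly in the nonignorable setting.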
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Cancer-related molecular mechanisms research
