A Copula-based Imputation Model for Missing Data of Mixed Type in Multilevel Data Sets
Jiali Wang, Bronwyn Loong, Anton H. Westveld, Alan H. Welsh

TL;DR
This paper introduces a copula-based latent variable model for imputing missing data of mixed types in multilevel datasets, effectively capturing variable relationships and clustering effects.
Contribution
It presents a novel copula-based imputation method that accounts for multilevel structure and variable types, improving accuracy over traditional methods.
Findings
Achieves good imputation accuracy in simulations
Enhances parameter recovery in clustered data
Adding random effects improves performance with strong clustering
Abstract
We propose a copula-based method to handle missing values in multivariate data of mixed types in multilevel data sets. Building upon the extended rank likelihood of Hoff (2007) and the multinomial probit model, our model is a latent variable model that captures the relationships among variables of different types and accounts for the clustering structure. We fit the model by approximating the posterior distribution of the parameters and the missing values through a Gibbs sampling scheme. We use multiple imputation to incorporate the uncertainty due to missing values in the analysis of the data. We evaluate the proposed method through simulations that compare it with several conventional methods of handling missing data. We also apply our method to a data set from a cluster randomized controlled trial of a multidisciplinary intervention in…
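The abstract mentions multiple imputation as the way missing-data uncertainty is carried into the analysis, but it does not say how results from the imputed data sets are combined. A minimal sketch, assuming the standard Rubin combining rules for a scalar estimate, is given below. The function name `pool_rubin` and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data results with Rubin's rules (assumed here).

    estimates: point estimates of one scalar quantity, one per imputed data set
    variances: the matching squared standard errors (within-imputation variances)
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total


# Illustrative values only: three imputed data sets, one regression coefficient.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The between-imputation term is what reflects the extra uncertainty due to the missing values: if every imputed data set gave the same estimate, the total variance would reduce to the average completed-data variance.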
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models · Statistical Methods and Inference
