Imputation of mixed data with multilevel singular value decomposition
François Husson (IRMAR), Julie Josse (CMAP, XPOP), Balasubramanian Narasimhan, Geneviève Robin (XPOP, CMAP)

TL;DR
This paper introduces a novel multilevel singular value decomposition method for imputing missing values in large, mixed, multilevel datasets, offering computational efficiency and applicability to real-world medical data.
Contribution
It presents the first SVD-based imputation method capable of handling mixed data types in multilevel datasets, improving speed and scalability over existing solutions.
Findings
Outperforms competitors in handling various dataset sizes
Computationally faster than existing methods
Successfully applied to medical data from multiple hospitals
Abstract
Statistical analysis of large data sets offers new opportunities to better understand many processes. Yet, data accumulation often implies relaxing acquisition procedures or compounding diverse sources. As a consequence, such data sets often contain mixed data, i.e. both quantitative and qualitative variables, and many missing values. Furthermore, aggregated data present a natural multilevel structure, where individuals or samples are nested within different sites, such as countries or hospitals. Imputation of multilevel data has therefore drawn some attention recently, but current solutions are not designed to handle mixed data, and suffer from important drawbacks such as their computational cost. In this article, we propose a single imputation method for multilevel data, which can be used to complete either quantitative, categorical or mixed data. The method is based on multilevel…
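To make the idea of SVD-based imputation of multilevel data concrete, here is a minimal Python sketch for the quantitative case only. It is not the authors' implementation: the abstract is truncated before the method is described, so the between-group/within-group split, the fixed ranks, and the names `low_rank`, `multilevel_svd_impute`, `rank_between` and `rank_within` are all assumptions made for illustration. Categorical and mixed variables, which the paper also covers, are not handled here.

```python
import numpy as np


def low_rank(M, rank):
    """Best rank-`rank` approximation of M, obtained by truncating its SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def multilevel_svd_impute(X, groups, rank_between=1, rank_within=1,
                          n_iter=200, tol=1e-8):
    """Illustrative iterative imputation (assumed scheme, quantitative data only).

    X      : (n, p) array with np.nan marking missing entries
    groups : length-n array of group labels (e.g. the hospital of each row)
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initialise missing cells with the column means of the observed values.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    labels, idx = np.unique(np.asarray(groups), return_inverse=True)

    for _ in range(n_iter):
        overall = filled.mean(axis=0)
        group_means = np.vstack(
            [filled[idx == g].mean(axis=0) for g in range(len(labels))]
        )
        # Between part: group means centred on the overall mean, one row per sample.
        between = (group_means - overall)[idx]
        # Within part: each sample's deviation from its own group mean.
        within = filled - group_means[idx]
        fitted = (overall
                  + low_rank(between, rank_between)
                  + low_rank(within, rank_within))
        # Only missing cells are overwritten; observed values are kept as-is.
        updated = np.where(missing, fitted, X)
        change = np.sum((updated - filled) ** 2)
        filled = updated
        if change < tol:
            break
    return filled
```

A call such as `multilevel_svd_impute(X, hospital_ids, rank_between=2, rank_within=3)` returns a completed copy of `X`, where `hospital_ids` is a hypothetical vector of site labels. In practice the two ranks would have to be chosen by some criterion such as cross-validation, which this sketch leaves out.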
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models · Statistical Methods and Inference
