Mixed and missing data: a unified treatment with latent graphical models

Xiao Li; Jinzhu Jia; Yuan Yao

arXiv:1511.04656·stat.ME·November 17, 2015

Open Access

TL;DR

This paper introduces a unified latent Gaussian graphical model for handling mixed and missing data, enabling improved data analysis, imputation, and prediction across various applications.

Contribution

It develops a latent Gaussian model with sparse inverse covariance estimation for mixed and missing data, and reports better prediction and imputation than existing methods.
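To make the link between a sparse inverse covariance and a graph concrete, here is a minimal sketch that is not taken from the paper. It assumes a three-variable Gaussian chain with covariance entries rho^|i-j|; the helper names `ar1_covariance` and `inverse_3x3` are hypothetical. The covariance matrix is dense, but its inverse has a zero in the corner, which corresponds to a missing edge between the first and third variables.

```python
def ar1_covariance(rho, n):
    """Covariance of a Gaussian chain: entry (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def inverse_3x3(m):
    """Invert a 3x3 matrix with the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[value / det for value in row] for row in adjugate]


# Assumed toy setting: three latent variables in a chain, rho = 0.5.
sigma = ar1_covariance(0.5, 3)
precision = inverse_3x3(sigma)

# Every covariance entry is nonzero, yet precision[0][2] is zero:
# variables 0 and 2 are conditionally independent given variable 1.
for row in precision:
    print([round(value, 4) for value in row])
```

A sparsity penalty on the inverse covariance, as named in the contribution, pushes the estimate toward zeros of this kind; the estimation procedure itself is not shown here.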

Findings

01 Outperforms state-of-the-art methods on medical datasets

02 Lower prediction error than random forest when the model is correctly specified

03 More effective than hot deck imputation even when the model is misspecified

Abstract

We propose to learn latent graphical models when data have mixed variables and missing values. The model can be used for further data analysis, including regression, classification, and ranking, and for imputing missing values. We specify a latent Gaussian model for the data, in which each categorical variable is generated by discretizing an unobserved variable and the latent variables are jointly multivariate Gaussian. The observed data consist of two parts: observed Gaussian variables and observed categorical variables, where the latter are treated as partially missing Gaussian variables. We use the Expectation-Maximization algorithm to fit the model. To prevent overfitting, we use sparse inverse covariance estimation to obtain a sparse estimate of the latent covariance matrix or, equivalently, the graphical model. The fitted model then could be used for problems…
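The data-generating process described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the three-variable chain, the cutpoints, the missingness probability, and the function names (`sample_latent_chain`, `discretize`, `generate_row`) are all hypothetical choices made for the example.

```python
import random


def sample_latent_chain(rho, rng):
    """Draw (z1, z2, z3) from a Gaussian Markov chain with unit variances.

    z1 and z3 are conditionally independent given z2, so the latent
    graph is sparse. The chain structure is an assumption of this sketch.
    """
    scale = (1.0 - rho ** 2) ** 0.5
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
    z3 = rho * z2 + scale * rng.gauss(0.0, 1.0)
    return z1, z2, z3


def discretize(z, cutpoints):
    """Map a latent value to a category: the number of cutpoints below z."""
    return sum(z > c for c in cutpoints)


def generate_row(rho, cutpoints, missing_prob, rng):
    """One observed row: two Gaussian values, one category, some entries missing."""
    z1, z2, z3 = sample_latent_chain(rho, rng)
    row = [z1, z2, discretize(z3, cutpoints)]
    # Each entry is independently hidden with probability missing_prob.
    return [None if rng.random() < missing_prob else value for value in row]


rng = random.Random(0)
data = [generate_row(0.6, (-0.5, 0.5), 0.1, rng) for _ in range(5)]
for row in data:
    print(row)
```

In this picture an observed category only reveals which interval the latent value fell in, which is why the abstract treats categorical entries as partially missing Gaussian variables; a fully missing entry (`None` here) reveals nothing.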

Taxonomy

Topics: Statistical Methods and Inference · Bayesian Modeling and Causal Inference · Statistical Methods and Bayesian Inference