Imputation of Missing Data Using Linear Gaussian Cluster-Weighted Modeling

Luis Alejandro Masmela-Caita; Thais Paiva Galletti; Marcos Oliveira Prates

arXiv:2110.12514·stat.ME·October 26, 2021

Open Access

TL;DR

This paper introduces a Bayesian Gaussian Cluster-Weighted modeling approach for imputing missing data in datasets with univariate missing patterns, leveraging auxiliary variables to improve accuracy.

Contribution

It proposes a novel imputation method based on Gaussian mixture models within a Bayesian framework, specifically designed for univariate missing data with auxiliary information.

Findings

01. Method outperforms existing techniques in simulations

02. Effective in diverse missing data scenarios

03. Demonstrated on real-world dataset

Abstract

Missing data theory concerns statistical methods for handling missing data. Missing data occur when some values of the variables of interest are not stored or observed. However, most statistical theory assumes that the data are fully observed. One way to deal with incomplete databases is to fill in the missing entries according to some criterion; this technique is called imputation. We introduce a new imputation methodology for databases with univariate missing patterns based on additional information from fully observed auxiliary variables. We assume that the non-observed variable is continuous and that the auxiliary variables help improve the imputation capacity of the model. In a fully Bayesian framework, our method uses a flexible mixture of multivariate normal distributions to model the response and the auxiliary variables jointly.…
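To make the idea of imputing from a joint Gaussian mixture concrete, the following is a minimal sketch, not the authors' fully Bayesian procedure. It assumes the mixture parameters (`weights`, `means`, `covs`) are already known and fixed, that the last coordinate of each component is the response y, and that the remaining coordinates are the fully observed auxiliary variables x. The function names are hypothetical. In a Bayesian treatment, a step like this would presumably be repeated for each posterior draw of the parameters.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate normal evaluated at each row of x."""
    d = mean.shape[0]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (x - mean).T)          # shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z**2, axis=0))

def conditional_mixture_impute(x, weights, means, covs, rng=None):
    """Impute a scalar y from auxiliary x of shape (n, p).

    Assumes a joint Gaussian mixture on (x, y) with fixed, known parameters.
    Returns conditional means E[y | x], or random draws from p(y | x)
    when a NumPy Generator is passed as rng.
    """
    x = np.atleast_2d(x)
    n, p = x.shape
    K = len(weights)
    log_resp = np.empty((n, K))
    cond_mean = np.empty((n, K))
    cond_var = np.empty(K)
    for k in range(K):
        mu_x, mu_y = means[k][:p], means[k][p]
        S_xx = covs[k][:p, :p]
        S_xy = covs[k][:p, p]
        S_yy = covs[k][p, p]
        beta = np.linalg.solve(S_xx, S_xy)           # slopes of y on x in component k
        cond_mean[:, k] = mu_y + (x - mu_x) @ beta   # component-wise linear regression
        cond_var[k] = S_yy - S_xy @ beta             # residual variance of y given x
        # Unnormalised log weight of component k given x alone
        log_resp[:, k] = np.log(weights[k]) + gaussian_logpdf(x, mu_x, S_xx)
    log_resp -= log_resp.max(axis=1, keepdims=True)  # stabilise before exponentiating
    resp = np.exp(log_resp)
    resp /= resp.sum(axis=1, keepdims=True)
    if rng is None:
        return np.sum(resp * cond_mean, axis=1)
    # Stochastic imputation: pick a component per row, then draw y from it
    comp = np.array([rng.choice(K, p=r) for r in resp])
    rows = np.arange(n)
    return rng.normal(cond_mean[rows, comp], np.sqrt(cond_var[comp]))

# Illustrative parameters only; they are not taken from the paper.
weights = np.array([0.5, 0.5])
means = [np.array([-5.0, -10.0]), np.array([5.0, 10.0])]
covs = [np.array([[1.0, 0.5], [0.5, 1.0]])] * 2
x_obs = np.array([[-4.0], [6.0]])
print(conditional_mixture_impute(x_obs, weights, means, covs))
```

Each component contributes a linear regression of y on x, and the component weights depend on x, which is what gives the conditional model its cluster-weighted form. Passing `rng=np.random.default_rng(0)` returns random draws instead of conditional means, which is closer in spirit to multiple imputation.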

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Soil Geostatistics and Mapping