Bayesian data combination model with Gaussian process latent variable model for mixed observed variables under NMAR missingness
Masaki Mitsuhiro, Takahiro Hoshino

TL;DR
This paper introduces a novel Bayesian data fusion method using Gaussian process latent variables to handle non-MAR missing data across multiple datasets, improving estimate validity in social science and business research.
Contribution
It presents the first approach to address non-random dataset assignment in data fusion using a Gaussian process latent variable model under reasonable assumptions.
Findings
Proposed method yields valid estimates in simulations and real data.
Existing methods produce biased results under non-MAR missingness.
Demonstrates effectiveness of non-MAR modeling in data integration.
Abstract
In the analysis of observational data in social sciences and businesses, it is difficult to obtain a "(quasi) single-source dataset" in which the variables of interest are simultaneously observed. Instead, multiple-source datasets are typically acquired for different individuals or units. Various methods have been proposed to investigate the relationship between the variables in each dataset, e.g., matching and latent variable modeling. It is necessary to utilize these datasets as a single-source dataset with missing variables. Existing methods assume that the datasets to be integrated are acquired from the same population or that the sampling depends on covariates. This assumption is referred to as missing at random (MAR) in terms of missingness. However, as will be shown in application studies, it is likely that this assumption does not hold in actual data analysis and the results…
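To make the MAR versus non-MAR distinction concrete, here is a minimal simulation sketch. It is not the paper's Gaussian process latent variable model; it only illustrates why a covariate-based adjustment can fail when dataset assignment depends on the unobserved outcome. The data-generating process, the logistic assignment rule, and the function name `imputation_estimate` are all assumptions made for this illustration.

```python
import math
import random


def imputation_estimate(mechanism, n=20000, seed=0):
    """Estimate E[y] by regression imputation when y is seen only in one dataset.

    Hypothetical setup (not from the paper): x ~ N(0, 1), y = x + noise,
    so the true E[y] is 0 by construction. A unit lands in the dataset
    that records y with probability logistic(driver), where the driver is
    x under "MAR" and y itself under "NMAR".
    """
    rng = random.Random(seed)
    xs, ys, observed = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, 1.0)
        driver = x if mechanism == "MAR" else y
        p_observed = 1.0 / (1.0 + math.exp(-driver))
        xs.append(x)
        ys.append(y)
        observed.append(rng.random() < p_observed)

    # Fit ordinary least squares of y on x using only units with observed y.
    x_obs = [x for x, o in zip(xs, observed) if o]
    y_obs = [y for y, o in zip(ys, observed) if o]
    mean_x = sum(x_obs) / len(x_obs)
    mean_y = sum(y_obs) / len(y_obs)
    sxx = sum((x - mean_x) ** 2 for x in x_obs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_obs, y_obs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Predict y for every unit (observed or not) and average the predictions.
    return sum(intercept + slope * x for x in xs) / n


if __name__ == "__main__":
    for mechanism in ("MAR", "NMAR"):
        print(mechanism, round(imputation_estimate(mechanism), 3))
```

In this setup the covariate-adjusted estimate lands near the true value of 0 when assignment depends only on x, and is shifted upward when assignment depends on y, because units with large y are over-represented among those whose y is recorded and conditioning on x cannot undo that.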
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Spectroscopy and Chemometric Analyses · Statistical Methods and Inference
Methods: Gaussian Process
