Recoverability of Joint Distribution from Missing Data
Jin Tian

TL;DR
This paper presents an algorithm that determines whether a joint probability distribution can be estimated from data with missing values, when the data-generating process is modeled as a Bayesian network that contains latent variables and explicitly represents the missingness mechanisms.
Contribution
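A systematic algorithm for deciding whether the joint distribution is estimable from data with missing values, including data that are not missing at random (MAR), given a Bayesian network that encodes both the dependencies among the variables and the mechanisms responsible for missingness.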
Findings
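The algorithm systematically determines whether the joint distribution is estimable from observed data with missing values.
It addresses data that are not missing at random (MAR), where estimability is not guaranteed.
It applies to Bayesian network models that contain unobserved latent variables.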
Abstract
A probabilistic query may not be estimable from observed data corrupted by missing values if the data are not missing at random (MAR). It is therefore of theoretical interest and practical importance to determine in principle whether a probabilistic query is estimable from missing data or not when the data are not MAR. We present an algorithm that systematically determines whether the joint probability is estimable from observed data with missing values, assuming that the data-generation model is represented as a Bayesian network containing unobserved latent variables that not only encodes the dependencies among the variables but also explicitly portrays the mechanisms responsible for the missingness process. The result significantly advances the existing work.
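To make the notion of estimability concrete, the sketch below simulates the simplest recoverable case. It is an illustration only and does not implement the paper's algorithm. The model is an assumption chosen for the example: a fully observed binary X, a binary Y that depends on X, and a missingness mechanism for Y that depends only on X (a MAR setting). Under that assumption the joint factorizes as P(X, Y) = P(X) · P(Y | X, Y observed), so it can be estimated from the incomplete data, whereas a naive estimate from complete rows alone is biased. All function names and probabilities are hypothetical.

```python
import random
from collections import Counter

def simulate(n, seed=0):
    """Draw n rows (x, y_obs); y_obs is None when Y is missing.
    Assumed model: X -> Y, and the missingness of Y depends only on X."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p_y = 0.8 if x == 1 else 0.3        # P(Y=1 | X=x)
        y = 1 if rng.random() < p_y else 0
        p_missing = 0.7 if x == 1 else 0.1  # P(Y missing | X=x)
        y_obs = None if rng.random() < p_missing else y
        rows.append((x, y_obs))
    return rows

def recover_joint(rows):
    """Estimate P(X, Y) as P(X) * P(Y | X, Y observed)."""
    n = len(rows)
    x_counts = Counter(x for x, _ in rows)
    complete = Counter((x, y) for x, y in rows if y is not None)
    complete_x = Counter(x for x, y in rows if y is not None)
    return {
        (x, y): (x_counts[x] / n) * (c / complete_x[x])
        for (x, y), c in complete.items()
    }

def complete_case_joint(rows):
    """Naive estimate: relative frequencies among complete rows only."""
    complete = [(x, y) for x, y in rows if y is not None]
    counts = Counter(complete)
    return {k: v / len(complete) for k, v in counts.items()}

rows = simulate(200_000)
recovered = recover_joint(rows)
naive = complete_case_joint(rows)

# The true value under the assumed model is P(X=1, Y=1) = 0.5 * 0.8 = 0.4.
print("recovered P(X=1, Y=1):", round(recovered[(1, 1)], 3))
print("naive     P(X=1, Y=1):", round(naive[(1, 1)], 3))
```

The factorized estimate lands close to the true 0.4, while the complete-case estimate is pulled toward 0.2 because rows with X=1 are dropped more often. When the missingness of a variable depends on unobserved quantities, or on the variable's own value, no such simple factorization is guaranteed to exist, which is the situation the paper's algorithm is designed to analyze.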
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Management and Algorithms · Data Quality and Management
