Database Querying under Missing Values Governed by Missingness Mechanisms
Leopoldo Bertossi, Farouk Toumani, Maxime Buron

TL;DR
This paper develops a probabilistic framework for query answering in relational databases with missing values, where the missingness mechanism is modeled as a Bayesian network. The framework captures both probabilistic uncertainty and the plausibility of the implicitly imputed values.
Contribution
It introduces a novel semantics and query answering techniques for databases with missing values governed by Bayesian network models, differing from NULL-based approaches.
Findings
Constructs a block-independent probabilistic database from missingness mechanisms.
Proposes two query answering techniques capturing uncertainty and plausibility.
Provides complexity results that characterize the computational feasibility of both techniques.
Abstract
We address the problems of giving a semantics to, and performing query answering (QA) on, a relational database (RDB) that has missing values (MVs). The causes of the latter are governed by a Missingness Mechanism that is modelled as a Bayesian Network, which represents a Missingness Graph (MG) and involves the DB attributes. Our approach departs considerably from the treatment of RDBs with NULL values. The MG together with the observed DB allows us to build a block-independent probabilistic DB, on the basis of which we propose two QA techniques that jointly capture probabilistic uncertainty and statistical plausibility of the implicit imputation of MVs. We obtain complexity results that characterize the computational feasibility of those approaches.
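To make the notion of a block-independent probabilistic DB more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction: the `blocks` data, its probabilities, and the helper `prob_exists` are all hypothetical. In the paper's setting the probabilities would be derived from the Missingness Graph and the observed DB, whereas here they are simply given. Each block holds the mutually exclusive completions of one tuple with a missing value, and distinct blocks are assumed independent.

```python
from fractions import Fraction

# Hypothetical toy data (not from the paper): each block lists the mutually
# exclusive completions of one observed tuple whose second attribute is
# missing. Probabilities within a block sum to 1.
blocks = {
    "t1": [(("alice", "high"), Fraction(3, 4)), (("alice", "low"), Fraction(1, 4))],
    "t2": [(("bob", "high"), Fraction(1, 2)), (("bob", "low"), Fraction(1, 2))],
}

def prob_exists(blocks, predicate):
    """Probability that at least one tuple satisfying `predicate` exists.

    Alternatives inside a block are disjoint events, so their probabilities
    add; different blocks are independent, so the probabilities of
    "no match in this block" multiply.
    """
    p_none = Fraction(1)
    for alternatives in blocks.values():
        p_block = sum((p for tup, p in alternatives if predicate(tup)), Fraction(0))
        p_none *= 1 - p_block
    return 1 - p_none

# Boolean query: "is there a tuple whose second attribute is 'high'?"
p = prob_exists(blocks, lambda t: t[1] == "high")
```

This covers only the simplest case, a Boolean existential query over a single relation. It does not attempt to reproduce either of the two QA techniques the abstract mentions.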
