Consensus Answers for Queries over Probabilistic Databases
Jian Li, Amol Deshpande

TL;DR
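This paper proposes answering a query over a probabilistic database with a single representative deterministic answer: the one that minimizes the expected distance to the answers over the possible worlds. It covers SPJ, top-k, group-by aggregate, and clustering queries, and for different distance metrics it gives polynomial-time optimal or approximation algorithms, or proves NP-hardness.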
Contribution
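It proposes the notion of a consensus world (or consensus answer) for probabilistic databases, states most results for the and/xor tree model, which generalizes earlier probabilistic database models, and gives polynomial-time optimal or approximation algorithms for different query types and distance metrics.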
Findings
Polynomial-time algorithms for certain query types and metrics.
NP-hardness results for other query types and metrics.
Generalization of probabilistic database models to and/xor trees.
Abstract
We address the problem of finding a "best" deterministic query answer to a query over a probabilistic database. For this purpose, we propose the notion of a consensus world (or a consensus answer) which is a deterministic world (answer) that minimizes the expected distance to the possible worlds (answers). This problem can be seen as a generalization of the well-studied inconsistent information aggregation problems (e.g. rank aggregation) to probabilistic databases. We consider this problem for various types of queries including SPJ queries, top-k queries, group-by aggregate queries, and clustering. For different distance metrics, we obtain polynomial time optimal or approximation algorithms for computing the consensus answers (or prove NP-hardness). Most of our results are for a general probabilistic database model, called and/xor tree model, which significantly generalizes…
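To make the definition of a consensus world concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm. It assumes a tuple-independent database (a much simpler model than the and/xor tree) and uses the symmetric difference between tuple sets as the distance; the names `possible_worlds`, `expected_distance`, `consensus_world`, and the example probabilities are all hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def possible_worlds(probs):
    """Yield (world, probability) for a tuple-independent database.

    `probs` maps each tuple id to its (assumed independent) existence probability.
    """
    tuples = list(probs)
    for mask in product([False, True], repeat=len(tuples)):
        world = frozenset(t for t, present in zip(tuples, mask) if present)
        p = 1.0
        for t, present in zip(tuples, mask):
            p *= probs[t] if present else 1.0 - probs[t]
        yield world, p


def expected_distance(candidate, probs):
    """Expected symmetric-difference distance from `candidate` to a random world."""
    return sum(p * len(candidate ^ world) for world, p in possible_worlds(probs))


def consensus_world(probs):
    """Exhaustively search all deterministic worlds for the minimizer.

    Exponential in the number of tuples; suitable for toy inputs only.
    """
    tuples = list(probs)
    candidates = (
        frozenset(c)
        for r in range(len(tuples) + 1)
        for c in combinations(tuples, r)
    )
    return min(candidates, key=lambda c: expected_distance(c, probs))


# Hypothetical example data.
example_probs = {"t1": 0.9, "t2": 0.3, "t3": 0.6}
best = consensus_world(example_probs)
print(sorted(best), round(expected_distance(best, example_probs), 6))
```

For this toy input the minimizer is the set of tuples whose probability exceeds 0.5. Under the independence assumption this follows from linearity of expectation: including a tuple costs 1 - p in expectation and excluding it costs p. The exhaustive search is only there to mirror the definition directly.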
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Logic, Reasoning, and Knowledge · Data Management and Algorithms
