Semantics and Evaluation of Top-k Queries in Probabilistic Databases
Xi Zhang, Jan Chomicki

TL;DR
This paper introduces the Global-Topk semantics for top-k queries in probabilistic databases and gives efficient evaluation algorithms for both simple and general probabilistic databases.
Contribution
The paper proposes a new semantics for top-k queries, Global-Topk, that satisfies three intuitive postulates to a large degree, and gives polynomial-time evaluation methods for both simple and general probabilistic databases.
Findings
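Global-Topk semantics satisfies, to a large degree, the three postulates formulated for top-k queries in probabilistic databases.
Dynamic-programming algorithms evaluate queries efficiently on simple probabilistic databases, with reported linear time complexity.
General probabilistic databases, which add exclusivity relationships between tuples, are handled by polynomial-time reductions to the simple case.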
Abstract
We study here fundamental issues involved in top-k query evaluation in probabilistic databases. We consider simple probabilistic databases in which probabilities are associated with individual tuples, and general probabilistic databases in which, additionally, exclusivity relationships between tuples can be represented. In contrast to other recent research in this area, we do not limit ourselves to injective scoring functions. We formulate three intuitive postulates that the semantics of top-k queries in probabilistic databases should satisfy, and introduce a new semantics, Global-Topk, that satisfies those postulates to a large degree. We also show how to evaluate queries under the Global-Topk semantics. For simple databases we design dynamic-programming based algorithms, and for general databases we show polynomial-time reductions to the simple cases. For example, we demonstrate that…
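To make the dynamic-programming idea for simple databases concrete, here is a minimal Python sketch. It rests on assumptions that the truncated abstract above does not spell out: that Global-Topk returns the k tuples most likely to appear in the top-k answer of a possible world, that tuples are independent, and that the scoring function is injective (no ties). The function name `global_topk` and the `(id, score, probability)` tuple layout are hypothetical and chosen only for illustration. This is not the authors' implementation.

```python
from typing import List, Tuple


def global_topk(
    tuples: List[Tuple[str, float, float]], k: int
) -> List[Tuple[str, float]]:
    """Return k (id, Pr[tuple is in a world's top-k]) pairs, highest first.

    Each input tuple is (id, score, probability). Assumes independent
    tuples, distinct scores, and k >= 1.
    """
    # Rank by score, highest first. Distinct scores are assumed.
    ranked = sorted(tuples, key=lambda t: t[1], reverse=True)

    # count[j] = probability that exactly j of the higher-scored tuples
    # processed so far exist, tracked only for j < k.
    count = [0.0] * k
    count[0] = 1.0

    result = []
    for tid, _score, p in ranked:
        # A tuple is in a world's top-k iff it exists and fewer than k
        # higher-scored tuples exist in that world.
        result.append((tid, p * sum(count)))

        # Fold this tuple into the counts. Going from high j to low j
        # ensures count[j - 1] still holds its old value.
        for j in range(k - 1, 0, -1):
            count[j] = count[j] * (1.0 - p) + count[j - 1] * p
        count[0] *= 1.0 - p

    # Keep the k tuples with the highest top-k probability.
    result.sort(key=lambda r: r[1], reverse=True)
    return result[:k]


if __name__ == "__main__":
    db = [("a", 3.0, 0.4), ("b", 2.0, 1.0), ("c", 1.0, 0.9)]
    print(global_topk(db, 2))
```

After the initial sort, the sketch does O(k) work per tuple, so O(kn) overall, which is linear in the number of tuples for a fixed k. In the sample database, the highest-scoring tuple `a` is not in the answer for k = 2, because its low existence probability makes it less likely to appear in a world's top-2 than `b` or `c`.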
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Graph Theory and Algorithms
