Faster Query Answering in Probabilistic Databases using Read-Once Functions
Sudeepa Roy, Vittorio Perduca, Val Tannen

TL;DR
This paper introduces an efficient method for determining and computing read-once forms of boolean event expressions in probabilistic databases, improving over previous approaches by avoiding disjunctive normal form conversions.
Contribution
The paper presents a novel, efficient algorithm for directly computing read-once forms of boolean expressions generated by conjunctive queries without self-joins, using the new concept of co-table graphs.
Findings
Efficient computation of co-occurrence graphs from provenance graphs.
Successful direct computation of read-once forms using co-table graphs.
Applies to tuple-independent probabilistic databases queried with conjunctive queries without self-joins.
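To make the term "co-occurrence graph" in the findings above concrete, here is a minimal sketch of the naive, definition-based construction: two variables are joined by an edge when they appear together in some clause of the expression's DNF. This is not the paper's method, which builds the graph from the provenance graph precisely to avoid the DNF. The function name `co_occurrence_graph` and the clause encoding are assumptions made for illustration, and the input is assumed to be an irredundant monotone DNF.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def co_occurrence_graph(dnf: Iterable[Set[str]]) -> Set[Tuple[str, str]]:
    """Naive co-occurrence graph of a monotone DNF (illustrative only).

    `dnf` is a collection of clauses; each clause is a set of variable names
    that are AND-ed together, and the clauses are OR-ed.
    Returns undirected edges as sorted (u, v) pairs.
    """
    edges: Set[Tuple[str, str]] = set()
    for clause in dnf:
        # Every pair of variables sharing a clause co-occurs.
        for u, v in combinations(sorted(clause), 2):
            edges.add((u, v))
    return edges


# Hypothetical example: (r1 AND s1) OR (r1 AND s2)
example_edges = co_occurrence_graph([{"r1", "s1"}, {"r1", "s2"}])
```

The cost of this naive version grows with the size of the DNF, which can be exponential in the size of the original expression; that blow-up is the motivation for computing the graph directly.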
Abstract
A boolean expression is in read-once form if each of its variables appears exactly once. When the variables denote independent events in a probability space, the probability of the event denoted by the whole expression in read-once form can be computed in polynomial time (whereas the general problem for arbitrary expressions is #P-complete). Known approaches to checking the read-once property seem to require putting these expressions in disjunctive normal form. In this paper, we tell a better story for a large subclass of boolean event expressions: those that are generated by conjunctive queries without self-joins on tuple-independent probabilistic databases. We first show that, given a tuple-independent representation and the provenance graph of an SPJ query plan without self-joins, we can efficiently compute the co-occurrence graph of a result event expression without using its DNF.…
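The polynomial-time claim for read-once forms can be illustrated with a short sketch. Because every variable appears exactly once, the children of any AND or OR node mention disjoint sets of independent variables, so their probabilities combine by the product rule (AND) and the complement-product rule (OR) in a single bottom-up pass. The classes `Var`, `And`, `Or` and the function `probability` are hypothetical names chosen for this example, and the sketch assumes monotone expressions (no negation); it is not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class And:
    children: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Expr", ...]


Expr = Union[Var, And, Or]


def variables(e: Expr) -> List[str]:
    """List every variable occurrence, with repetitions."""
    if isinstance(e, Var):
        return [e.name]
    out: List[str] = []
    for child in e.children:
        out.extend(variables(child))
    return out


def is_read_once_form(e: Expr) -> bool:
    """True if no variable occurs more than once in the expression tree."""
    names = variables(e)
    return len(names) == len(set(names))


def _prob(e: Expr, p: Dict[str, float]) -> float:
    if isinstance(e, Var):
        return p[e.name]
    if isinstance(e, And):
        result = 1.0
        for child in e.children:
            result *= _prob(child, p)  # independent children: multiply
        return result
    none_true = 1.0
    for child in e.children:
        none_true *= 1.0 - _prob(child, p)  # probability that no child holds
    return 1.0 - none_true


def probability(e: Expr, p: Dict[str, float]) -> float:
    """Probability of a read-once expression over independent variables."""
    if not is_read_once_form(e):
        raise ValueError("expression is not in read-once form")
    return _prob(e, p)


# Hypothetical example: (x1 OR x2) AND y, each variable true with probability 0.5
expr = And((Or((Var("x1"), Var("x2"))), Var("y")))
result = probability(expr, {"x1": 0.5, "x2": 0.5, "y": 0.5})
```

The same recursion applied to an expression with a repeated variable would give a wrong answer, since the children would no longer be independent; the sketch therefore rejects such input instead of evaluating it.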
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Data Quality and Management
