What do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog have to do with one another?
Mahmoud Abo Khamis, Hung Q. Ngo, Dan Suciu

TL;DR
This paper connects information theory, database output-size bounds, and algorithm design: it extends output-size bounds from conjunctive queries to disjunctive datalog rules and introduces the PANDA algorithm, whose runtime matches the submodular width bound.
Contribution
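It introduces Shannon flow inequalities and their proof sequences, uses them to derive output-size bounds for disjunctive datalog rules, and interprets each proof sequence as instructions for an algorithm (PANDA) that runs within the predicted bound, generalizing previous results to broader query classes.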
Findings
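PANDA answers disjunctive datalog rules within the time predicted by the size bound, and yields algorithms whose runtimes match the submodular width.
Output bounds extend from conjunctive queries to disjunctive datalog rules with functional dependencies and degree constraints.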
Improves runtime bounds over previous algorithms.
Abstract
Recent works on bounding the output size of a conjunctive query with functional dependencies and degree constraints have shown a deep connection between fundamental questions in information theory and database theory. We prove analogous output bounds for disjunctive datalog rules, and answer several open questions regarding the tightness and looseness of these bounds along the way. Our bounds are intimately related to Shannon-type information inequalities. We devise the notion of a "proof sequence" of a specific class of Shannon-type information inequalities called "Shannon flow inequalities". We then show how such a proof sequence can be interpreted as symbolic instructions guiding an algorithm called "PANDA", which answers disjunctive datalog rules within the time that the size bound predicted. We show that PANDA can be used as a black-box to devise algorithms matching precisely the…
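To make the link between Shannon-type inequalities and output-size bounds concrete, here is a minimal sketch using the triangle query Q(A,B,C) :- R(A,B), S(B,C), T(A,C), a standard illustration that is not taken from this page. The Shannon-type inequality h(AB) + h(BC) + h(AC) >= 2 h(ABC) implies |Q| <= sqrt(|R| |S| |T|). The code below is not PANDA; the function names `triangle_join`, `entropy`, and `check_triangle_bound` are hypothetical helpers that only check the inequality and the bound on a small instance.

```python
import math
from collections import Counter


def triangle_join(R, S, T):
    """Return all (a, b, c) with (a, b) in R, (b, c) in S, (a, c) in T."""
    s_by_b = {}
    for b, c in S:
        s_by_b.setdefault(b, []).append(c)
    t_set = set(T)
    return {
        (a, b, c)
        for a, b in R
        for c in s_by_b.get(b, [])
        if (a, c) in t_set
    }


def entropy(tuples, positions):
    """Entropy in bits of the marginal on `positions`, taking the
    uniform distribution over `tuples`."""
    counts = Counter(tuple(t[i] for i in positions) for t in tuples)
    n = len(tuples)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def check_triangle_bound(R, S, T):
    """Return (output size, sqrt(|R||S||T|), lhs, rhs) where
    lhs = h(AB) + h(BC) + h(AC) and rhs = 2 h(ABC)."""
    out = triangle_join(R, S, T)
    bound = math.sqrt(len(R) * len(S) * len(T))
    lhs = entropy(out, (0, 1)) + entropy(out, (1, 2)) + entropy(out, (0, 2))
    rhs = 2 * entropy(out, (0, 1, 2))
    return len(out), bound, lhs, rhs


# Assumed toy instance: every relation is the full 3 x 3 grid.
R = {(a, b) for a in range(3) for b in range(3)}
S = set(R)
T = set(R)
size, bound, lhs, rhs = check_triangle_bound(R, S, T)
print(size, bound, lhs >= rhs - 1e-9)
```

On this instance the bound is attained with equality, which is why the comparison uses a small floating-point tolerance. A skewed instance would show the output size falling strictly below the bound.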
