Query Optimization and Evaluation via Information Theory: A Tutorial
Mahmoud Abo Khamis, Hung Q. Ngo, Dan Suciu

TL;DR
This tutorial introduces the PANDA framework, which uses information-theoretic bounds to optimize conjunctive query evaluation, matching specialized algorithms' performance through a unified, principled approach.
Contribution
The tutorial presents the PANDA framework, which derives query plans from tight upper bounds on intermediate relation sizes, unifying and generalizing earlier specialized algorithms.
Findings
PANDA achieves runtime performance comparable to specialized algorithms.
The framework derives query plans directly from mathematical proofs of bounds.
It unifies various algorithms under a single, principled information-theoretic approach.
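To make "upper bounds on relation sizes" concrete, here is a minimal sketch built on the triangle query Q(a,b,c) :- R(a,b), S(b,c), T(a,c). The well-known AGM bound for this query is |Q| <= sqrt(|R| * |S| * |T|). This is not the PANDA algorithm and is not taken from the tutorial; the function names `triangle_join` and `agm_bound_triangle` and the toy data are assumptions made purely for illustration.

```python
import math
from collections import defaultdict


def triangle_join(R, S, T):
    """Naively evaluate Q(a,b,c) :- R(a,b), S(b,c), T(a,c) (illustrative only)."""
    # Index S by its first attribute so each R-tuple can look up matching c values.
    s_by_b = defaultdict(list)
    for b, c in S:
        s_by_b[b].append(c)
    t_set = set(T)
    return {
        (a, b, c)
        for a, b in R
        for c in s_by_b[b]
        if (a, c) in t_set
    }


def agm_bound_triangle(R, S, T):
    """AGM bound for the triangle query: sqrt(|R| * |S| * |T|)."""
    return math.sqrt(len(R) * len(S) * len(T))


# Toy instance (an assumption): every relation is the full 3 x 3 grid of pairs.
R = {(i, j) for i in range(3) for j in range(3)}
S = set(R)
T = set(R)

out = triangle_join(R, S, T)
bound = agm_bound_triangle(R, S, T)
print(len(out), bound)
```

On this instance the output size meets the bound exactly, which is what "tight" means here. The bound depends only on the input cardinalities, so it can be computed before any join is run.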
Abstract
Database theory is exciting because it studies highly general and practically useful abstractions. Conjunctive query (CQ) evaluation is a prime example: it simultaneously generalizes graph pattern matching, constraint satisfaction, and statistical inference, among others. This generality is both the strength and the central challenge of the field. The query optimization and evaluation problem is fundamentally a "meta-algorithm" problem: given a query and statistics about the input database, how should one best answer it? Because the problem is so general, it is often impossible for such a meta-algorithm to match the runtimes of specialized algorithms designed for a fixed query -- or so it seemed. The past fifteen years have witnessed an exciting development in database theory: a general framework, called PANDA, that emerged from advances in database theory, constraint…
