One Size Does NOT Fit All: On the Importance of Physical Representations for Datalog Evaluation
Nick Rassau, Felix Schuhknecht

TL;DR
This paper emphasizes the importance of selecting appropriate physical data representations in Datalog engines, demonstrating through experiments that workload-specific choices significantly impact performance and proposing an automatic selection method.
Contribution
It introduces an in-depth experimental analysis of physical representations in Datalog evaluation and develops a decision tree-based mechanism for workload-aware selection.
Findings
Physical representation choice greatly affects Datalog performance.
Workload characteristics such as relation size and recursiveness influence optimal representation.
The proposed decision tree approach effectively automates representation selection.
Abstract
Datalog is an increasingly popular recursive query language that is declarative by design, meaning its programs must be translated by an engine into the actual physical execution plan. When generating this plan, a central decision is how to physically represent all involved relations, an aspect in which existing Datalog engines are surprisingly restrictive and often resort to one-size-fits-all solutions. The reason for this is that the typical execution plan of a Datalog program performs not just a single type of operation against the physical representations, but a mixture of operations, such as insertions, lookups, and containment checks. Further, the relevance of each operation type depends heavily on the workload characteristics, which range from familiar properties such as the size, multiplicity, and arity of the individual relations to very specific Datalog properties, such as the…
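To make the mixture of operations mentioned in the abstract concrete, here is a minimal sketch of semi-naive evaluation for a transitive-closure program. It is an illustration only, not the engine or method evaluated in the paper. The function name `transitive_closure`, the `stats` counters, and the choice of a Python `set` plus a `dict` index as physical representations are all assumptions made for this example.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Semi-naive evaluation of the Datalog program
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    Returns the path relation and a count of each operation type.
    """
    # Index on the first column of edge: serves the join lookups.
    edge_index = defaultdict(list)
    for src, dst in edges:
        edge_index[src].append(dst)

    path = set(edges)   # full relation: must answer containment checks quickly
    delta = set(edges)  # tuples newly derived in the previous iteration
    stats = {"insertions": len(path), "lookups": 0, "containment_checks": 0}

    while delta:
        new_delta = set()
        for x, y in delta:
            stats["lookups"] += 1                 # index lookup on edge(y, _)
            for z in edge_index.get(y, ()):
                stats["containment_checks"] += 1  # is (x, z) already known?
                if (x, z) not in path:
                    path.add((x, z))              # insertion into the full relation
                    new_delta.add((x, z))
                    stats["insertions"] += 1
        delta = new_delta
    return path, stats


if __name__ == "__main__":
    result, counts = transitive_closure([(1, 2), (2, 3), (3, 4)])
    print(sorted(result))
    print(counts)
```

Even in this small program, one relation (`edge`) is only read through an index, while the other (`path`) is hit by both containment checks and insertions. How these counts balance out depends on the input, which is the kind of workload dependence the abstract describes.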
Taxonomy
Topics: Logic, programming, and type systems · Software Testing and Debugging Techniques · Scientific Computing and Data Management
