Avoiding Materialisation for Guarded Aggregate Queries
Matthias Lanzinger, Reinhard Pichler, Alexander Selzer

TL;DR
This paper introduces new optimization techniques for aggregate queries in databases that avoid materialising intermediate join results by leveraging guardedness restrictions, improving efficiency in analytical query processing.
Contribution
It proposes novel logical and physical optimization methods based on guardedness to prevent materialisation of join results in aggregate queries.
Findings
Optimizations significantly reduce intermediate result sizes.
Implementation in Spark SQL demonstrates practical efficiency gains.
Empirical evaluation shows improved query performance on standard benchmarks.
Abstract
Optimising queries with many joins is known to be a hard problem. The explosion of intermediate results, as opposed to a much smaller final result, poses a serious challenge to modern database management systems (DBMSs). This is particularly glaring in the case of analytical queries that join many tables but ultimately output only comparatively small aggregate information. Analogous problems are faced by graph database systems when processing analytical queries with aggregates on top of complex path queries. In this work, we propose novel optimisation techniques, on both the logical and the physical level, that allow us to avoid the materialisation of join results for certain types of aggregate queries. The key to these optimisations is the notion of guardedness, by which we impose restrictions on the occurrence of attributes in GROUP BY clauses and in aggregate expressions. The efficacy of…
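To make the idea of avoiding materialisation concrete, here is a minimal Python sketch. It is not the paper's algorithm or its Spark SQL implementation. It illustrates the general principle on a hypothetical chain join R(a, b) ⋈ S(b, c) ⋈ T(c, d) with COUNT(*) grouped by a, under the simplifying assumption that the grouping attribute occurs in a single relation (R), which then plays the role of the guard. The relation layout and the function names count_by_group_naive and count_by_group_propagated are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def count_by_group_naive(r, s, t):
    """Materialise R(a,b) JOIN S(b,c) JOIN T(c,d), then count rows per a."""
    joined = [
        (a, b, c, d)
        for (a, b) in r
        for (b2, c) in s if b2 == b
        for (c2, d) in t if c2 == c
    ]
    return dict(Counter(row[0] for row in joined))


def count_by_group_propagated(r, s, t):
    """Same result, but only per-key counts flow between relations."""
    # Number of T tuples for each join key c.
    t_count = Counter(c for (c, _d) in t)

    # Number of matching (S, T) combinations for each join key b.
    s_count = defaultdict(int)
    for (b, c) in s:
        s_count[b] += t_count.get(c, 0)

    # The grouping attribute a lives only in R, so the final counts can be
    # accumulated while scanning R once; no joined tuple is ever built.
    result = defaultdict(int)
    for (a, b) in r:
        n = s_count.get(b, 0)
        if n:  # skip dangling R tuples, as the join would
            result[a] += n
    return dict(result)


if __name__ == "__main__":
    r = [(1, 10), (1, 11), (2, 10), (3, 99)]
    s = [(10, 100), (10, 101), (11, 100)]
    t = [(100, "x"), (100, "y"), (101, "z")]
    print(count_by_group_naive(r, s, t))
    print(count_by_group_propagated(r, s, t))
```

The naive version builds every joined tuple before aggregating, so its intermediate result can be far larger than the output. The propagated version keeps only one counter per join key, which is the kind of saving the abstract describes for aggregate queries whose final result is small.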
Taxonomy
Topics: Privacy-Preserving Technologies in Data
