# Optimal Joins using Compact Data Structures

**Authors:** Gonzalo Navarro, Juan L. Reutter, Javiel Rojas-Ledesma

arXiv: 1908.01812 · 2020-01-10

## TL;DR

This paper introduces a worst-case optimal join algorithm that works directly on compact quadtree representations of relations, using qdags (dynamic quadtrees that share subtrees) for intermediate results. It needs no additional index storage, and a lazy variant (lqdags) extends it to more expressive relational algebra queries.

## Contribution

The paper shows that worst-case optimal join processing can be achieved directly from a compact representation of relations as point sets in grids, without the extra storage that additional indexes would require.

## Key findings

- The proposed method is worst-case optimal in data complexity.
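- It uses compact quadtrees for the static indexes and qdags (dynamic quadtrees that share subtrees) for intermediate results.
- A lazy version of qdags (lqdags) extends the framework to more expressive relational algebra queries, again with worst-case optimal running time.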
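
For context, worst-case optimality of a join algorithm is conventionally measured against the AGM bound on the output size. This summary does not spell the bound out, so the statement below is the standard definition from the join literature, not a formula quoted from the paper. For a join query $Q$ over relations $R_1, \dots, R_m$ and any fractional edge cover $(x_1, \dots, x_m)$ of its attributes:

$$
|Q| \;\le\; \prod_{i=1}^{m} |R_i|^{x_i}
$$

As a worked instance, take the triangle query $Q = R(A,B) \bowtie S(B,C) \bowtie T(A,C)$ with $|R| = |S| = |T| = N$. The cover $x = (\tfrac12, \tfrac12, \tfrac12)$ covers every attribute with total weight 1, so

$$
|Q| \;\le\; N^{1/2} \cdot N^{1/2} \cdot N^{1/2} \;=\; N^{3/2},
$$

whereas a plan that joins two relations first may materialize up to $N^2$ intermediate tuples. An algorithm is called worst-case optimal when its running time matches this bound up to smaller factors.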

## Abstract

Worst-case optimal join algorithms have gained a lot of attention in the database literature. We now have several algorithms that are optimal in the worst case, and many of them have been implemented and validated in practice. However, the implementation of these algorithms often requires an enhanced indexing structure: to achieve optimality we either need to build completely new indexes, or we must populate the database with several instantiations of indexes such as B+-trees. Either way, this means spending an extra amount of storage space that may be non-negligible.

We show that optimal algorithms can be obtained directly from a representation that regards the relations as point sets in variable-dimensional grids, without the need of extra storage. Our representation is a compact quadtree for the static indexes, and a dynamic quadtree sharing subtrees (which we dub a qdag) for intermediate results. We develop a compositional algorithm to process full join queries under this representation, and show that the running time of this algorithm is worst-case optimal in data complexity. Remarkably, we can extend our framework to evaluate more expressive queries from relational algebra by introducing a lazy version of qdags (lqdags). Once again, we can show that the running time of our algorithms is worst-case optimal.
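
To make the idea of "relations as point sets in grids" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the quadtree is pointer-based instead of compact, and the names `build` and `join` and the dict-based node layout are hypothetical. Attribute values are assumed to be integers in `[0, 2**k)`. Each relation is indexed by a quadtree over its own attributes. The join then walks all trees in lockstep over the grid of the output attributes, reading each relation's child through a projection of the output cell onto that relation's attributes. This loosely mirrors how a qdag can view a lower-dimensional quadtree in a higher dimension without copying it.

```python
from itertools import product


def build(points, k):
    """Build a pointer-based quadtree over a grid of side 2**k.

    Node layout (an assumption of this sketch, not the paper's compact
    encoding): None = empty relation, True = one occupied cell,
    dict = internal node mapping a tuple of high-order bits (one bit per
    attribute) to a non-empty child.
    """
    points = set(points)
    if not points:
        return None
    if k == 0:
        return True
    half = 1 << (k - 1)
    buckets = {}
    for p in points:
        key = tuple(int(x >= half) for x in p)        # which quadrant
        buckets.setdefault(key, set()).add(tuple(x % half for x in p))
    return {key: build(sub, k - 1) for key, sub in buckets.items()}


def join(relations, out_attrs, k):
    """Natural join of quadtree-indexed relations.

    relations: list of (tree, attrs) pairs, attrs being a tuple of names.
    out_attrs: tuple naming every attribute of the output, in output order.
    Returns a set of tuples ordered like out_attrs.
    """
    if any(tree is None for tree, _ in relations):
        return set()
    # For each relation, where its attributes sit inside out_attrs.
    positions = [[out_attrs.index(a) for a in attrs] for _, attrs in relations]
    results = set()

    def descend(nodes, depth, prefix):
        if depth == k:                      # reached a single grid cell
            results.add(prefix)
            return
        for bits in product((0, 1), repeat=len(out_attrs)):
            children = []
            for node, pos in zip(nodes, positions):
                # Project the output cell onto this relation's attributes.
                child = node.get(tuple(bits[i] for i in pos))
                if child is None:           # empty in one relation: prune
                    break
                children.append(child)
            else:
                descend(children, depth + 1,
                        tuple((v << 1) | b for v, b in zip(prefix, bits)))

    descend([tree for tree, _ in relations], 0, (0,) * len(out_attrs))
    return results


# Usage: R(A, B) joined with S(B, C) on a 4 x 4 grid (k = 2).
R = build({(0, 1), (1, 1), (2, 3)}, k=2)
S = build({(1, 2), (1, 3), (3, 0)}, k=2)
result = join([(R, ("A", "B")), (S, ("B", "C"))], ("A", "B", "C"), k=2)
```

The pruning step is the point of the sketch: a region of the output grid is abandoned as soon as any one relation is empty there. The sketch does not attempt to reproduce the paper's compact encoding, its sharing of subtrees in intermediate results, or its complexity analysis.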

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.01812/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/1908.01812/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1908.01812/full.md

---
Source: https://tomesphere.com/paper/1908.01812