Flexible Caching in Trie Joins
Oren Kalinsky, Yoav Etsion, Benny Kimelfeld

TL;DR
This paper extends the Leapfrog Trie Join (LFTJ) algorithm with flexible caching that balances memory use against repeated computation in multiway join processing, as demonstrated through experiments on real datasets.
Contribution
It incorporates caches into LFTJ by tying variable ordering to tree decomposition, building on recent developments in join optimization, and allows the cache size to be adjusted dynamically.
Findings
Improved join performance with adaptive caching.
Reduced memory traffic during join computation.
Effective balance between memory usage and computation cost.
Abstract
Traditional algorithms for multiway join computation are based on rewriting the order of joins and combining results of intermediate subqueries. Recently, several approaches have been proposed for algorithms that are "worst-case optimal" wherein all relations are scanned simultaneously. An example is Veldhuizen's Leapfrog Trie Join (LFTJ). An important advantage of LFTJ is its small memory footprint, due to the fact that intermediate results are full tuples that can be dumped immediately. However, since the algorithm does not store intermediate results, recurring joins must be reconstructed from the source relations, resulting in excessive memory traffic. In this paper, we address this problem by incorporating caches into LFTJ. We do so by adopting recent developments on join optimization, tying variable ordering to tree decomposition. While the traditional usage of tree decomposition…
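The trade-off the abstract describes, recomputing a recurring sub-join versus spending memory to remember it, can be illustrated with a small sketch. This is not the authors' algorithm and it does not implement leapfrog intersection: it is a plain nested-iteration count over a hypothetical path query R(a, b), S(b, c), T(c, d), with variables bound in the order a, b, c, d. The relation names, the `count_paths` function, and the `cache_capacity` parameter are all assumptions made for illustration. Once b is fixed, the number of (c, d) extensions depends on b alone, so that count can be cached under the key b, and a bounded cache shows how memory can be traded for repeated work.

```python
from collections import defaultdict


def build_index(pairs):
    """Map each first-column value to the sorted list of its second-column values."""
    index = defaultdict(list)
    for x, y in sorted(set(pairs)):
        index[x].append(y)
    return index


def count_paths(R, S, T, cache_capacity=0):
    """Count tuples (a, b, c, d) with R(a, b), S(b, c), T(c, d).

    Illustrative only: a nested-iteration join, not Leapfrog Trie Join.
    cache_capacity bounds the number of cached sub-join counts (0 = no cache).
    """
    r, s, t = build_index(R), build_index(S), build_index(T)
    cache = {}
    stats = {"hits": 0, "recomputed": 0}

    def extensions(b):
        # Number of (c, d) pairs that extend a fixed b; depends on b only.
        if b in cache:
            stats["hits"] += 1
            return cache[b]
        stats["recomputed"] += 1
        total = sum(len(t.get(c, ())) for c in s.get(b, ()))
        if len(cache) < cache_capacity:  # store only while there is room
            cache[b] = total
        return total

    total = 0
    for a, bs in r.items():
        for b in bs:
            total += extensions(b)
    return total, stats


# Hypothetical toy data: b = 10 recurs under three different values of a.
R = [(1, 10), (2, 10), (3, 10), (4, 20)]
S = [(10, 100), (10, 101), (20, 100)]
T = [(100, 7), (100, 8), (101, 9)]

print(count_paths(R, S, T, cache_capacity=0))  # every sub-join is recomputed
print(count_paths(R, S, T, cache_capacity=1))  # the recurring b = 10 is reused
```

With no cache, the sub-join under b = 10 is rebuilt for each matching a; with room for a single entry, it is computed once and reused, while the result count stays the same. A real implementation would choose the cache keys from the tree decomposition rather than hard-coding them as done here.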
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Algorithms and Data Compression
