Optimizing Relational Queries over Array-Valued Data in Columnar Systems
Maroua Zeblah (TYREX), Etienne Couritas, Sarah Chlyah (TYREX), Pierre Genevès (TYREX), Nils Gesbert (TYREX), Nabil Layaïda (TYREX)

TL;DR
This paper presents A3D-RA, an extended relational algebra for array-valued data, with optimization techniques that improve query performance in columnar database systems handling mixed relational and array data.
Contribution
It introduces a formal algebraic framework and optimization strategies for relational queries involving array attributes, enhancing performance in analytical systems.
Findings
Consistent performance improvements across three database engines.
A complete set of equivalence-preserving transformation rules.
Polynomial-time plan enumeration with optimality guarantees.
Abstract
Modern analytical workloads increasingly combine relational data with array-valued attributes. While columnar database systems efficiently process such workloads, their ability to optimize queries that interleave relational operators with array manipulations remains limited. This paper introduces A3D-RA, an extended relational algebra supporting array-valued attributes, together with a comprehensive framework for algebraic reasoning and optimization. We formalize its data model and semantics, develop a complete set of equivalence-preserving transformation rules capturing pairwise interactions between relational and array operators, and propose a plan enumeration strategy with an optimality guarantee that remains polynomial in all non-join operators. We design A3D-RA as a modular, backend-independent optimization layer that can be instantiated over existing analytical database systems.…
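
To make the idea of an equivalence-preserving transformation between relational and array operators concrete, here is a minimal sketch in Python. It is not taken from the paper: the operators `select` and `array_map`, the toy `readings` relation, and the specific rule shown (pushing a relational selection below an element-wise array map) are all illustrative assumptions, chosen only because such a rewrite is a plausible example of the kind of interaction the abstract describes.

```python
from typing import Any, Callable, Dict, List

# Hypothetical toy model: a relation is a list of rows, and a row may hold
# an array-valued attribute (a plain Python list). None of these names come
# from A3D-RA itself.
Row = Dict[str, Any]
Relation = List[Row]


def select(rel: Relation, pred: Callable[[Row], bool]) -> Relation:
    """Relational selection: keep the rows that satisfy pred."""
    return [row for row in rel if pred(row)]


def array_map(rel: Relation, attr: str, fn: Callable[[Any], Any]) -> Relation:
    """Array operator: apply fn to every element of the array attribute attr."""
    return [{**row, attr: [fn(x) for x in row[attr]]} for row in rel]


def array_elements(rel: Relation, attr: str) -> int:
    """Number of array elements an array_map over rel would have to touch."""
    return sum(len(row[attr]) for row in rel)


readings: Relation = [
    {"sensor": "a", "region": "north", "samples": [1.0, 2.0, 3.0]},
    {"sensor": "b", "region": "south", "samples": [4.0, 5.0]},
    {"sensor": "c", "region": "north", "samples": [6.0]},
]


def is_north(row: Row) -> bool:
    # The predicate reads only a scalar attribute, never the array.
    return row["region"] == "north"


def scale(x: float) -> float:
    return x * 10.0


# Plan 1: transform every array, then filter rows.
plan_naive = select(array_map(readings, "samples", scale), is_north)

# Plan 2: filter rows first, then transform only the surviving arrays.
plan_pushed = array_map(select(readings, is_north), "samples", scale)

# Both plans return the same relation, so the rewrite preserves equivalence.
assert plan_naive == plan_pushed

# The pushed-down plan maps over fewer array elements.
work_naive = array_elements(readings, "samples")
work_pushed = array_elements(select(readings, is_north), "samples")
```

The rewrite is valid in this sketch only because the selection predicate does not read the attribute that the array operator modifies. A rule of this kind would need such a side condition, and an optimizer would prefer the second plan because it touches fewer array elements.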
