A Novel Approach to Translate Structural Aggregation Queries to MapReduce Code
Ahmed M. Abdelmoniem, Sameh Abdulah, Walid Atwa

TL;DR
This paper introduces a system that translates array queries with complex structural aggregations from SciDB's AQL into optimized MapReduce code, which runs faster than handwritten implementations and reduces development effort.
Contribution
The paper presents a novel translator for array queries with structural aggregations that bridges SciDB and MapReduce, and it demonstrates performance improvements over handwritten code.
Findings
Generated MapReduce code is up to 10.84x faster than handwritten implementations.
The translator supports complex array aggregations such as circular, grid, hierarchical, and sliding.
It reduces user effort in developing MapReduce applications for array data.
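To make the grid and sliding aggregations named above concrete, here is a minimal in-memory sketch of how each can be phrased as a map step followed by a grouped reduce. It is an illustration only, not the code the paper's translator generates: the function names (`map_grid`, `map_sliding`, `reduce_sum`), the 2-D list input, and the choice of `sum` as the aggregate are all assumptions made for this example.

```python
from collections import defaultdict


def iter_cells(array):
    """Yield ((row, col), value) for every cell of a 2-D list."""
    for r, row in enumerate(array):
        for c, v in enumerate(row):
            yield (r, c), v


def map_grid(cells, tile):
    """Map step for a grid aggregation: key each cell by its tile index."""
    tile_rows, tile_cols = tile
    for (r, c), v in cells:
        yield (r // tile_rows, c // tile_cols), v


def map_sliding(cells, window, shape):
    """Map step for a sliding aggregation: emit each cell once per
    window that covers it, keyed by the window's top-left corner."""
    win_rows, win_cols = window
    rows, cols = shape
    for (r, c), v in cells:
        for r0 in range(max(0, r - win_rows + 1), min(r, rows - win_rows) + 1):
            for c0 in range(max(0, c - win_cols + 1), min(c, cols - win_cols) + 1):
                yield (r0, c0), v


def reduce_sum(pairs):
    """Shuffle and reduce step: group values by key, then sum each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical 4x4 input array.
array = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

grid_result = reduce_sum(map_grid(iter_cells(array), (2, 2)))
sliding_result = reduce_sum(map_sliding(iter_cells(array), (2, 2), (4, 4)))
```

In this sketch the grid case sends every cell to exactly one reducer key, while the sliding case replicates a cell to each overlapping window, so the sliding map step emits more intermediate pairs for the same input.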
Abstract
Data management applications are growing and require more attention, especially in the "big data" era. Thus, supporting such applications with novel and efficient algorithms that achieve higher performance is critical. Array database management systems are one way to support these applications by dealing with data represented in n-dimensional data structures. For instance, software like SciDB and RasDaMan can be powerful tools to achieve the required performance on large-scale problems with multidimensional data. Like their relational counterparts, these management systems support specific array query languages as the user interface. As a popular programming model, MapReduce allows large-scale data analysis, facilitates query processing, and is used as a DB engine. Nevertheless, one major obstacle is the low productivity of developing MapReduce applications. Unlike high-level…
