Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark
Elham Azhir, Mehdi Hosseinzadeh, Faheem Khan, Amir Mosavi

TL;DR
This paper evaluates the performance of query plan recommendation using Apache Hadoop and Apache Spark, demonstrating that Spark provides significantly faster clustering of query datasets, enhancing scalability in large-scale query optimization.
Contribution
It introduces a MapReduce-based clustering approach for query plan recommendation and compares the performance of Hadoop and Spark frameworks.
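To make the idea of MapReduce-based clustering concrete, here is a minimal sketch of one clustering iteration written in map/reduce style. It is an illustration under stated assumptions, not the authors' implementation: the summary does not name the clustering algorithm, so k-means is assumed here, and the numeric query feature vectors and the function names (`map_assign`, `reduce_sum`, `kmeans_iteration`) are hypothetical. The sketch runs on a single machine in plain Python; in Hadoop or Spark the map and reduce steps would be distributed across workers.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]
Partial = Tuple[Vector, int]  # (component-wise sum of vectors, number of vectors)


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_assign(point: Vector, centroids: Sequence[Vector]) -> Tuple[int, Partial]:
    """Map step: emit (index of nearest centroid, (point, 1))."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: squared_distance(point, centroids[i]),
    )
    return nearest, (point, 1)


def reduce_sum(a: Partial, b: Partial) -> Partial:
    """Reduce step: merge two partial (sum, count) pairs for the same cluster."""
    (sum_a, count_a), (sum_b, count_b) = a, b
    return tuple(x + y for x, y in zip(sum_a, sum_b)), count_a + count_b


def kmeans_iteration(
    points: Sequence[Vector], centroids: Sequence[Vector]
) -> List[Vector]:
    """Run one assumed k-means iteration as a map phase followed by a reduce phase."""
    partials: Dict[int, Partial] = {}
    for point in points:
        key, value = map_assign(point, centroids)
        # Combine values that share a key, as a reducer would.
        partials[key] = reduce_sum(partials[key], value) if key in partials else value

    # Clusters that received no points keep their previous centroid.
    updated = list(centroids)
    for key, (vector_sum, count) in partials.items():
        updated[key] = tuple(x / count for x in vector_sum)
    return updated


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors standing in for encoded queries.
    queries = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans_iteration(queries, [(1.0, 1.0), (9.0, 9.0)]))
```

Because `reduce_sum` is associative and commutative, the partial sums can be merged in any order, which is what allows the reduce phase to be spread across many nodes.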
Findings
Spark outperforms Hadoop with an average 2x speedup.
Parallel query clustering improves scalability for large datasets.
Efficient clustering reduces query optimization time.
Abstract
Access plan recommendation is a query optimization approach that executes new queries using previously created query execution plans (QEPs). In this method, the query optimizer divides the query space into clusters. However, traditional clustering algorithms take a significant amount of execution time to cluster large query datasets. The MapReduce distributed computing model provides efficient solutions for storing and processing vast quantities of data. In the present investigation, the Apache Spark and Apache Hadoop frameworks are used to cluster query datasets of different sizes in the MapReduce-based access plan recommendation method. Performance is evaluated based on execution time. The experimental results demonstrate the effectiveness of parallel query clustering in achieving high scalability. Furthermore, Apache Spark achieved better performance than…
Taxonomy
Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Caching and Content Delivery
