Pattern Matching for Self-Tuning of MapReduce Jobs
Nikzad Babaii Rizvandi, Javid Taheri, Albert Y. Zomaya

TL;DR
This paper proposes a method to optimize MapReduce job execution by matching CPU utilization patterns using Dynamic Time Warping, enabling system parameter tuning based on historical application patterns.
Contribution
It introduces a pattern matching approach using DTW for self-tuning MapReduce jobs based on CPU utilization patterns, improving execution efficiency.
Findings
Effective pattern matching for CPU utilization using DTW.
Successful application to real MapReduce tasks.
Promising results on pseudo-distributed platforms.
Abstract
In this paper, we study CPU utilization time patterns of several MapReduce applications. The running patterns extracted from these applications are saved in a reference database and later used to tweak system parameters so that unknown applications can be executed efficiently in the future. To achieve this goal, CPU utilization patterns of new applications are compared with the already known ones in the reference database to find/predict their most probable execution patterns. Because the patterns differ in length, Dynamic Time Warping (DTW) is used for this comparison; a correlation analysis is then applied to the DTW outcomes to produce feasible similarity patterns. Three real applications (WordCount, Exim Mainlog parsing and Terasort) are used to evaluate our hypothesis in tweaking system parameters in executing similar applications. Results were very promising and showed…
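The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the classic DTW recurrence with an absolute-difference local cost, and it assumes that the "correlation analysis" is a Pearson correlation over the two DTW-aligned series. The function names (dtw_path, similarity, best_match) and the dictionary-based reference database are hypothetical and chosen only for illustration.

```python
from math import sqrt


def dtw_path(a, b):
    """Classic DTW: return (total cost, alignment path) for sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost: absolute difference
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def similarity(a, b):
    """Warp a and b onto a common time axis with DTW, then correlate them."""
    _, path = dtw_path(a, b)
    warped_a = [a[i] for i, _ in path]
    warped_b = [b[j] for _, j in path]
    return pearson(warped_a, warped_b)


def best_match(new_pattern, reference_db):
    """Return the name of the reference pattern most similar to new_pattern."""
    return max(reference_db, key=lambda name: similarity(new_pattern, reference_db[name]))


# Hypothetical CPU utilization samples (percent); not data from the paper.
new_app = [0, 10, 50, 90, 50, 10, 0]
reference_db = {
    "rising": [0, 20, 40, 60, 80, 100],
    "peak": [0, 0, 10, 10, 50, 50, 90, 90, 50, 50, 10, 10, 0, 0],
}
matched = best_match(new_app, reference_db)
```

In this toy setup the new pattern is a time-compressed copy of the "peak" reference, so DTW aligns the two despite their different lengths. Under the paper's idea, the configuration known to work well for the matched reference application would then be a candidate for the new one.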
