Failure Analysis of Hadoop Schedulers using an Integration of Model Checking and Simulation
Mbarka Soualhia, Foutse Khomh, Sofiene Tahar

TL;DR
This paper introduces a novel methodology combining simulation and model checking to formally verify Hadoop schedulers, enabling early detection of task failures and checking properties such as schedulability, fairness, and deadlock-freeness.
Contribution
It presents an integrated approach using CSP and model checking to verify Hadoop scheduler properties, improving upon existing simulation and analytical methods.
Findings
Identified up to 78% of task failures early in the process
Validated the methodology on the OpenCloud Hadoop cluster
Demonstrated improved verification of scheduler properties
Abstract
The Hadoop scheduler is a centerpiece of Hadoop, the leading processing framework for data-intensive applications in the cloud. Given the impact of failures on the performance of applications running on Hadoop, testing and verifying the performance of the Hadoop scheduler is critical. Existing approaches such as performance simulation and analytical modeling are inadequate because they cannot provide a complete verification of a Hadoop scheduler, owing to the wide range of constraints and aspects involved in Hadoop. In this paper, we propose a novel methodology that integrates simulation and model checking techniques to perform a formal verification of Hadoop schedulers, focusing on the following properties: schedulability, fairness and resources-deadlock freeness. We use the CSP language to formally describe a Hadoop scheduler, and the PAT model checker to…
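To make the resources-deadlock freeness property concrete, here is a minimal sketch of explicit-state checking on a toy model. It is not the authors' CSP model and does not use PAT; the function names (`held_resources`, `successors`, `find_deadlocks`), the resource names `slot` and `disk`, and the two-task setup are all hypothetical, chosen only to illustrate the idea. Each task acquires its resources in a fixed order, then finishes and releases them all. A breadth-first search enumerates every reachable state and reports those where no task can move although some task is unfinished.

```python
from collections import deque

# Toy model (an assumption for illustration, not the paper's CSP model):
# state[i] is how many resources task i has acquired in its fixed order;
# len(order) + 1 means the task has finished and released everything.

def held_resources(state, orders):
    """Return the set of resources currently held by unfinished tasks."""
    held = set()
    for progress, order in zip(state, orders):
        if progress <= len(order):
            held.update(order[:progress])
    return held

def successors(state, orders):
    """Yield every state reachable from `state` in one step."""
    held = held_resources(state, orders)
    for i, (progress, order) in enumerate(zip(state, orders)):
        if progress < len(order):
            # The task can advance only if its next resource is free.
            if order[progress] not in held:
                yield state[:i] + (progress + 1,) + state[i + 1:]
        elif progress == len(order):
            # The task holds everything it needs: finish and release.
            yield state[:i] + (progress + 1,) + state[i + 1:]

def find_deadlocks(orders):
    """Explore all reachable states breadth-first; return deadlocked ones."""
    initial = tuple(0 for _ in orders)
    seen = {initial}
    queue = deque([initial])
    deadlocks = []
    while queue:
        state = queue.popleft()
        nexts = list(successors(state, orders))
        all_done = all(p == len(o) + 1 for p, o in zip(state, orders))
        if not nexts and not all_done:
            deadlocks.append(state)
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return deadlocks

# Opposite acquisition orders can deadlock; a shared order cannot.
conflicting = find_deadlocks([("slot", "disk"), ("disk", "slot")])
consistent = find_deadlocks([("slot", "disk"), ("slot", "disk")])
```

With opposite acquisition orders the search finds the state in which each task holds one resource and waits for the other; with a shared order it finds none. A model checker applies the same exhaustive exploration to a far richer model, which is why it can give guarantees that sampling-based simulation alone cannot.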
Taxonomy
Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · Service-Oriented Architecture and Web Services
