Optimization for Speculative Execution of Multiple Jobs in a MapReduce-like Cluster
Huanle Xu, Wing Cheong Lau

TL;DR
This paper develops and analyzes optimization-based speculative execution schemes for MapReduce-like clusters, significantly reducing job delays under various load conditions by intelligently launching duplicate tasks.
Contribution
It introduces the Smart Cloning Algorithm (SCA) and the Straggler Detection Algorithm (SDA) for light loads, and an Enhanced Speculative Execution (ESE) algorithm for heavy loads, improving efficiency and reducing delays.
Findings
SCA and SDA reduce job flowtime by nearly 60% compared to Microsoft Mantri.
The ESE algorithm outperforms Mantri by 18% in job flowtime under heavy load.
Proposed schemes balance resource use and delay reduction effectively.
Abstract
Nowadays, a computing cluster in a typical data center can easily consist of hundreds of thousands of commodity servers, making component/machine failures the norm rather than the exception. A parallel processing job can be delayed substantially as long as one of its many tasks is being assigned to a failing machine. To tackle this so-called straggler problem, most parallel processing frameworks such as MapReduce have adopted various strategies under which the system may speculatively launch additional copies of the same task if its progress is abnormally slow or simply because extra idling resource is available. In this paper, we focus on the design of speculative execution schemes for a parallel processing cluster under different loading conditions. For the lightly loaded case, we analyze and propose two optimization-based schemes, namely, the Smart Cloning Algorithm (SCA) which is based…
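To make the straggler effect described in the abstract concrete, here is a minimal simulation sketch. It is not the paper's SCA, SDA, or ESE; it only illustrates why launching extra copies of a task can shorten a job. The Pareto task-time distribution, the task and trial counts, and the function names are all assumptions chosen for illustration.

```python
import random


def job_time(num_tasks, copies, rng, alpha=2.0):
    """Completion time of one job (hypothetical model).

    The job ends when its slowest task ends, and each task ends when
    the fastest of its copies ends. Task durations are drawn from a
    heavy-tailed Pareto distribution, which is an assumption here.
    """
    return max(
        min(rng.paretovariate(alpha) for _ in range(copies))
        for _ in range(num_tasks)
    )


def mean_job_time(num_tasks, copies, trials=2000, seed=0):
    """Average job completion time over many simulated jobs."""
    rng = random.Random(seed)
    total = sum(job_time(num_tasks, copies, rng) for _ in range(trials))
    return total / trials


# Compare running every task once against running two copies of each.
no_clone = mean_job_time(num_tasks=20, copies=1)
two_clones = mean_job_time(num_tasks=20, copies=2)
print(f"mean job time, 1 copy per task:   {no_clone:.2f}")
print(f"mean job time, 2 copies per task: {two_clones:.2f}")
```

In this toy model the second copy doubles the resources a job consumes, which is the trade-off between resource use and delay that a speculative execution scheme has to manage.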
Taxonomy
Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Graph Theory and Algorithms
