Analysis of Workflow Schedulers in Simulated Distributed Environments

Jakub Beránek; Stanislav Böhm; Vojtěch Cima

arXiv:2204.07211·cs.DC·April 18, 2022

TL;DR

This paper introduces an open-source simulation environment for benchmarking workflow schedulers in distributed systems, revealing significant impacts of realistic network modeling and implementation details on scheduler performance.

Contribution

It provides a flexible simulation platform for testing scheduling algorithms and highlights the importance of realistic network models and detailed algorithm descriptions.

Findings

01. Network models significantly affect scheduling results.

02. Implementation details greatly influence scheduler performance.

03. Realistic simulation environments are crucial for accurate benchmarking.

Abstract

Task graphs provide a simple way to describe scientific workflows (sets of tasks with dependencies) that can be executed both on HPC clusters and in the cloud. An important aspect of executing such graphs is the scheduling algorithm used. Many scheduling heuristics have been proposed in existing works; nevertheless, they are often tested in oversimplified environments. We provide an extensible simulation environment designed for prototyping and benchmarking task schedulers. It contains implementations of various scheduling algorithms and is open source, so that results are fully reproducible. We use this environment to perform a comprehensive analysis of workflow scheduling algorithms, with a focus on quantifying the effect of scheduling challenges that have so far been mostly neglected, such as delays between scheduler invocations or partially unknown task durations. Our results indicate…
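To make the notion of a task graph and a simulated scheduler concrete, here is a minimal sketch. It is not the paper's simulator or any of its algorithms: the function name `simulate_greedy`, the example graph, and the task durations are all hypothetical. It assumes identical workers, fully known task durations, and zero-cost data transfers, which is exactly the kind of oversimplified environment the abstract cautions against.

```python
import heapq


def simulate_greedy(durations, deps, workers):
    """Return the makespan of a greedy schedule on `workers` identical workers.

    Hypothetical illustration only: ignores network transfers and scheduler
    delays, and assumes `workers >= 1` and an acyclic graph.
    """
    # Count unfinished predecessors per task and build reverse edges.
    remaining = {t: len(deps.get(t, ())) for t in durations}
    consumers = {t: [] for t in durations}
    for task, parents in deps.items():
        for parent in parents:
            consumers[parent].append(task)

    ready = sorted(t for t, n in remaining.items() if n == 0)
    running = []  # min-heap of (finish_time, task)
    now = 0.0
    free = workers
    while ready or running:
        # Start ready tasks, oldest first, while workers are free.
        while ready and free > 0:
            task = ready.pop(0)
            heapq.heappush(running, (now + durations[task], task))
            free -= 1
        # Advance simulated time to the next task completion.
        now, done = heapq.heappop(running)
        free += 1
        for consumer in consumers[done]:
            remaining[consumer] -= 1
            if remaining[consumer] == 0:
                ready.append(consumer)
    return now


# Hypothetical workflow: c needs a; d needs b and c.
durations = {"a": 2, "b": 3, "c": 1, "d": 2}
deps = {"c": ["a"], "d": ["b", "c"]}

print(simulate_greedy(durations, deps, workers=2))  # 5.0
print(simulate_greedy(durations, deps, workers=1))  # 8.0
```

With two workers the makespan equals the critical path a, c, d (2 + 1 + 2); with one worker it equals the sum of all durations. A more realistic simulation would also charge time for moving task outputs between workers, which this sketch leaves out.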
