Reshi: Recommending Resources for Scientific Workflow Tasks on Heterogeneous Infrastructures
Jonathan Bader, Fabian Lehmann, Alexander Groth, Lauritz Thamsen, Dominik Scheinert, Jonathan Will, Ulf Leser, Odej Kao

TL;DR
Reshi is a resource recommendation method for scientific workflows that improves scheduling efficiency on heterogeneous infrastructures by using regression models to predict optimal task-node assignments.
Contribution
Reshi introduces a regression-based approach for recommending task-node assignments, handling heterogeneity in both tasks and infrastructure, and demonstrates improved scheduling performance.
Findings
Reshi reduces mean makespan by up to 18% compared to HEFT.
Reshi effectively handles heterogeneous resources and tasks.
Benchmarking on AWS shows practical applicability.
Abstract
Scientific workflows typically comprise a multitude of different processing steps which often are executed in parallel on different partitions of the input data. These executions, in turn, must be scheduled on the compute nodes of the computational infrastructure at hand. This assignment is complicated by the facts that (a) tasks typically have highly heterogeneous resource requirements and (b) in many infrastructures, compute nodes offer highly heterogeneous resources. In consequence, predictions of the runtime of a given task on a given node, as required by many scheduling algorithms, are often rather imprecise, which can lead to sub-optimal scheduling decisions. We propose Reshi, a method for recommending task-node assignments during workflow execution that can cope with heterogeneous tasks and heterogeneous nodes. Reshi approaches the problem as a regression task, where task-node…
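To make the idea of treating task-node assignment as a regression problem more concrete, the following is a minimal sketch, not Reshi's actual model. The abstract excerpt does not specify the features, the regressor, or the target, so everything here is an assumption for illustration: the single feature `task_load / node_speed`, the runtime-like score, the closed-form least-squares fit, and the names `fit_line`, `recommend`, `history`, and `node_speeds` are all hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def recommend(task_load, node_speeds, model):
    """Rank nodes for one task by predicted score (lower is better)."""
    slope, intercept = model
    scores = {
        node: slope * (task_load / speed) + intercept
        for node, speed in node_speeds.items()
    }
    return sorted(scores, key=scores.get)


# Hypothetical execution history: (task load, node speed, observed runtime).
history = [
    (10.0, 1.0, 21.0),
    (10.0, 2.0, 11.0),
    (40.0, 2.0, 41.0),
    (40.0, 4.0, 21.0),
]

# Assumed single feature: task load divided by node speed.
xs = [load / speed for load, speed, _ in history]
ys = [runtime for _, _, runtime in history]
model = fit_line(xs, ys)

# Hypothetical heterogeneous nodes with different relative speeds.
node_speeds = {"small": 1.0, "medium": 2.0, "large": 4.0}
ranking = recommend(20.0, node_speeds, model)
print(ranking)
```

In this toy setup the regressor is trained on past task-node executions and then scores every candidate node for a new task; the scheduler would pick the best-ranked node that is currently free. A real system would use richer task and node features and a stronger regressor.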
Taxonomy
Topics: Scientific Computing and Data Management · Cloud Computing and Resource Management · Machine Learning in Materials Science
