TL;DR
This paper introduces BBSched, a multi-resource scheduler for HPC systems that schedules jobs by both their CPU and burst buffer requirements, using a multi-objective genetic algorithm to improve resource utilization and scheduling performance.
Contribution
It presents a novel multi-resource scheduling scheme that explicitly optimizes multiple resources beyond CPUs using a multi-objective genetic algorithm.
Findings
BBSched improves scheduling performance by up to 41%.
Explicit multi-resource optimization enhances HPC resource utilization.
The approach effectively balances tradeoffs among CPU and burst buffer resources.
Abstract
High performance computing (HPC) is undergoing significant changes. Emerging HPC applications comprise both compute- and data-intensive applications. To meet the intense I/O demand from emerging data-intensive applications, burst buffers are deployed in production systems. Existing HPC schedulers are mainly CPU-centric. The extreme heterogeneity of hardware devices, combined with workload changes, forces schedulers to consider multiple resources (e.g., burst buffers) beyond CPUs in decision making. In this study, we present a multi-resource scheduling scheme named BBSched that schedules user jobs based on not only their CPU requirements, but also other schedulable resources such as burst buffers. BBSched formulates the scheduling problem as a multi-objective optimization (MOO) problem and rapidly solves the problem using a multi-objective genetic algorithm. The multiple…
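To make the multi-objective formulation concrete, here is a minimal sketch of one way such a problem could be set up: choose a subset of queued jobs that maximizes both CPU and burst buffer usage without exceeding capacity, and keep the non-dominated (Pareto) selections. Everything in it is an assumption for illustration rather than BBSched's actual design: the job window `JOBS`, the capacities `CPU_CAP` and `BB_CAP`, the bitmask encoding, the one-point crossover and single-bit mutation, and the archive-based selection.

```python
import random

# Hypothetical job window: (cpu_nodes, burst_buffer_tb) for each queued job.
JOBS = [(40, 10), (30, 60), (50, 5), (20, 30), (10, 45)]
CPU_CAP, BB_CAP = 100, 100  # assumed free capacity of each resource


def objectives(mask):
    """Return (cpu_used, bb_used), or None if the selection exceeds capacity."""
    cpu = sum(job[0] for job, bit in zip(JOBS, mask) if bit)
    bb = sum(job[1] for job, bit in zip(JOBS, mask) if bit)
    if cpu > CPU_CAP or bb > BB_CAP:
        return None
    return (cpu, bb)


def dominates(a, b):
    """True if a is at least as good as b on every objective and differs from b."""
    return a != b and all(x >= y for x, y in zip(a, b))


def pareto_front(population):
    """Keep the feasible selections that no other feasible selection dominates."""
    scored = {tuple(m): objectives(m) for m in population}
    feasible = {m: s for m, s in scored.items() if s is not None}
    return {m: s for m, s in feasible.items()
            if not any(dominates(t, s) for t in feasible.values())}


def evolve(generations=50, pop_size=20, seed=0):
    """Evolve job selections and return {bitmask: (cpu_used, bb_used)}."""
    rng = random.Random(seed)
    n = len(JOBS)
    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    pop.append((0,) * n)  # the empty selection is always feasible
    archive = {}
    for _ in range(generations):
        # Merge the current population into the archive of non-dominated picks.
        archive = pareto_front(list(archive) + pop)
        parents = list(archive)
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.choice(parents), rng.choice(parents)
            cut = rng.randrange(1, n)          # one-point crossover
            child = list(p1[:cut] + p2[cut:])
            i = rng.randrange(n)               # flip one bit as mutation
            child[i] = 1 - child[i]
            children.append(tuple(child))
        pop = children
    return pareto_front(list(archive) + pop)


if __name__ == "__main__":
    for mask, (cpu, bb) in sorted(evolve().items(), key=lambda kv: kv[1]):
        print(mask, "CPU used:", cpu, "BB used:", bb)
```

Each returned bitmask is one candidate scheduling decision. The result is a set of tradeoffs rather than a single answer, so a scheduler built along these lines would still need a policy for picking one selection from the front.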
