Energy hardware and workload aware job scheduling towards interconnected HPC environments
Marco D'Amico, Julita Corbalan

TL;DR
This paper introduces EAMC, a job scheduling policy for heterogeneous multi-cluster HPC environments that optimizes energy consumption, response time, and makespan by predicting performance and energy use for different hardware configurations.
Contribution
The paper presents a novel energy-aware multi-cluster job scheduling policy (EAMC) that predicts performance and energy consumption to optimize resource allocation in heterogeneous HPC systems.
Findings
Up to 25% reduction in response time.
Up to 20% energy savings.
Improved cluster utilization and workload efficiency.
Abstract
New HPC machines are approaching exascale. Their power consumption has been increasing, and researchers are studying ways to reduce it. A second trend is the growing complexity of HPC machines, with increasingly heterogeneous hardware components and different cluster architectures cooperating in the same machine. We call these heterogeneous multi-cluster environments. To optimize performance and energy consumption in these environments, this paper proposes an Energy-Aware Multi-Cluster (EAMC) job scheduling policy. The EAMC policy optimizes the scheduling and placement of jobs by predicting the performance and energy consumption of arriving jobs for different hardware architectures and processor frequencies, reducing the workload's energy consumption, makespan, and response time. The policy assigns a different priority to each…
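The abstract describes choosing a placement from predicted performance and energy per hardware architecture and processor frequency, but it does not state the selection rule. The sketch below is therefore only an illustration of that general idea, not the EAMC algorithm: it assumes a predictor has already produced a runtime and average power for each (cluster, frequency) pair, and it picks the lowest-energy configuration whose predicted runtime stays within a tolerated slowdown of the fastest option. All names (`Config`, `Prediction`, `pick_config`, `max_slowdown`) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """A candidate placement: hypothetical cluster name plus CPU frequency."""
    cluster: str
    freq_ghz: float


@dataclass(frozen=True)
class Prediction:
    """Predicted behaviour of one job on one Config (values are assumed inputs)."""
    runtime_s: float
    avg_power_w: float

    @property
    def energy_j(self) -> float:
        # Energy = average power x runtime.
        return self.runtime_s * self.avg_power_w


def pick_config(predictions: dict[Config, Prediction],
                max_slowdown: float = 1.10) -> Config:
    """Return the lowest-energy config whose predicted runtime is at most
    max_slowdown times the fastest predicted runtime.

    This selection rule is an assumption made for illustration only.
    """
    if not predictions:
        raise ValueError("no candidate configurations")
    fastest = min(p.runtime_s for p in predictions.values())
    eligible = {c: p for c, p in predictions.items()
                if p.runtime_s <= fastest * max_slowdown}
    return min(eligible, key=lambda c: eligible[c].energy_j)


# Made-up predictions for a single arriving job.
predictions = {
    Config("cluster_a", 2.4): Prediction(runtime_s=100.0, avg_power_w=300.0),
    Config("cluster_a", 2.0): Prediction(runtime_s=108.0, avg_power_w=250.0),
    Config("cluster_b", 2.6): Prediction(runtime_s=150.0, avg_power_w=150.0),
}

strict_choice = pick_config(predictions)                     # 10% slowdown budget
relaxed_choice = pick_config(predictions, max_slowdown=2.0)  # energy dominates
```

With these made-up numbers, the default 10% budget excludes `cluster_b` because its predicted runtime is 50% longer than the fastest, so the lower-frequency `cluster_a` option is chosen. Relaxing the budget lets the lower-energy `cluster_b` placement win. A real policy would also have to account for queue state and job priorities, which this sketch ignores.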
