# HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling

**Authors:** Matthias Maiterth, Wesley H. Brewer, Jaya S. Kuruvella, Arunavo Dey, Tanzima Z. Islam, Kevin Menear, Dmitry Duplyakin, Rashadul Kabir, Tapasya Patki, Terry Jones, Feiyi Wang

arXiv: 2508.20016 · 2025-10-03

## TL;DR

This paper introduces a digital twin framework for HPC systems that integrates scheduling, so that policies and incentive structures can be evaluated before deployment together with their effects on power and cooling infrastructure.

## Contribution

The paper presents what the authors describe as the first digital twin framework with scheduling capabilities. It integrates publicly available datasets from several HPC systems and adds extensions for coupling external scheduling simulators.

## Key findings

- Enabled what-if scenario analysis for HPC sustainability.
- Demonstrated evaluation of incentive structures and machine learning-based scheduling.
- Provided a versatile platform for pre-deployment policy testing.

## Abstract

Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis or to simulators, which do not model the associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarding changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, and (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as well as (5) evaluate machine-learning-based scheduling, in such a novel digital-twin-based meta-framework to prototype scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability and the impact on the simulated system.
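
To make the idea of a scheduling what-if study concrete, the following is a minimal sketch and not the framework described in the paper. It replays the same hypothetical job list under two queue orderings (first-come-first-served and shortest-job-first) and reports wait time, makespan, peak power, and energy. Every name here (`Job`, `simulate`, `IDLE_NODE_POWER`, `DEMO_JOBS`) and every number is an assumption made for illustration. Power is modeled as a constant per busy node, with no backfilling and no cooling model.

```python
from dataclasses import dataclass
import heapq

IDLE_NODE_POWER = 100.0  # watts per idle node (hypothetical value)


@dataclass(frozen=True)
class Job:
    name: str
    submit: float      # submission time in seconds
    nodes: int         # number of nodes requested
    runtime: float     # execution time in seconds
    node_power: float  # watts per busy node (assumed constant)


def simulate(jobs, total_nodes, priority):
    """Replay `jobs` on `total_nodes` nodes, starting queued jobs in
    `priority` order without backfilling, and return summary metrics."""
    if any(j.nodes > total_nodes for j in jobs):
        raise ValueError("a job requests more nodes than the system has")
    pending = sorted(jobs, key=lambda j: j.submit)
    queue, running = [], []  # running is a heap of (end_time, seq, job)
    free, now, seq, i = total_nodes, 0.0, 0, 0
    waits, energy, peak = {}, 0.0, 0.0
    while i < len(pending) or queue or running:
        # Move jobs that have arrived by `now` into the queue.
        while i < len(pending) and pending[i].submit <= now:
            queue.append(pending[i])
            i += 1
        # Start jobs in priority order; stop at the first one that does not fit.
        queue.sort(key=priority)
        while queue and queue[0].nodes <= free:
            job = queue.pop(0)
            free -= job.nodes
            waits[job.name] = now - job.submit
            heapq.heappush(running, (now + job.runtime, seq, job))
            seq += 1
        # Instantaneous system power: idle nodes plus all running jobs.
        power = free * IDLE_NODE_POWER + sum(
            j.nodes * j.node_power for _, _, j in running
        )
        peak = max(peak, power)
        # Advance to the next completion or arrival, integrating energy.
        events = []
        if running:
            events.append(running[0][0])
        if i < len(pending):
            events.append(pending[i].submit)
        nxt = min(events)
        energy += power * (nxt - now)
        now = nxt
        # Release nodes of jobs that have finished by `now`.
        while running and running[0][0] <= now:
            _, _, done = heapq.heappop(running)
            free += done.nodes
    mean_wait = sum(waits.values()) / len(waits) if waits else 0.0
    return {"mean_wait": mean_wait, "makespan": now,
            "peak_power": peak, "energy": energy}


# Two candidate policies expressed as sort keys over the queue.
POLICIES = {
    "fcfs": lambda j: j.submit,   # first come, first served
    "sjf": lambda j: j.runtime,   # shortest job first
}

# Hypothetical workload for a 4-node system.
DEMO_JOBS = [
    Job("A", submit=0.0, nodes=4, runtime=10.0, node_power=300.0),
    Job("B", submit=1.0, nodes=4, runtime=100.0, node_power=400.0),
    Job("C", submit=2.0, nodes=2, runtime=10.0, node_power=200.0),
]

if __name__ == "__main__":
    for name, key in POLICIES.items():
        print(name, simulate(DEMO_JOBS, total_nodes=4, priority=key))
```

In this toy workload the two policies differ only in mean wait time, because the jobs never overlap. A digital twin as described in the abstract would additionally couple the resulting power trace to models of the physical power and cooling infrastructure, which this sketch does not attempt.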

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20016/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20016/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/2508.20016/full.md

---
Source: https://tomesphere.com/paper/2508.20016