Machine Learning Guided Cooling System Optimization for Data Center
Shrenik Jadhav, Zheng Liu

TL;DR
This paper introduces a physics-guided machine learning framework that identifies and reduces cooling energy waste in data centers, recovering up to 96% of the identified excess cooling energy through interpretable, safe setpoint adjustments.
Contribution
The paper presents a novel three-stage framework that combines machine learning with physics constraints to quantify and recover excess cooling energy in high-performance computing facilities.
Findings
Achieved a mean absolute error of 0.026 MW in predicting facility accessory power.
Identified approximately 85 MWh of annual cooling inefficiency.
Recovered up to 96% of excess cooling energy through setpoint adjustments.
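The findings above relate through simple energy bookkeeping. The sketch below is illustrative only and is not the authors' implementation: it assumes excess cooling energy is the positive part of measured minus baseline power, integrated over fixed sampling intervals (10 minutes here), and that the recovered share is the relative drop in that excess after a setpoint change. The function names and all sample values are hypothetical.

```python
def excess_energy_mwh(measured_mw, baseline_mw, interval_minutes=10.0):
    """Sum the positive part of (measured - baseline) power, in MWh.

    Assumption: samples are evenly spaced, so each one represents
    interval_minutes of operation at that power level.
    """
    if len(measured_mw) != len(baseline_mw):
        raise ValueError("measured and baseline series must have equal length")
    hours_per_sample = interval_minutes / 60.0
    excess_mw = (max(0.0, m - b) for m, b in zip(measured_mw, baseline_mw))
    return sum(excess_mw) * hours_per_sample


def recovered_fraction(excess_before_mwh, excess_after_mwh):
    """Share of the original excess removed by an adjustment."""
    if excess_before_mwh <= 0.0:
        return 0.0  # nothing to recover
    return (excess_before_mwh - excess_after_mwh) / excess_before_mwh


# Hypothetical accessory power samples in MW (not data from the paper).
baseline = [1.30, 1.30, 1.30, 1.32]   # surrogate baseline
measured = [1.30, 1.42, 1.25, 1.38]   # observed operation
adjusted = [1.30, 1.31, 1.25, 1.32]   # after a setpoint adjustment

before = excess_energy_mwh(measured, baseline)
after = excess_energy_mwh(adjusted, baseline)
share = recovered_fraction(before, after)
print(f"excess before: {before:.4f} MWh, after: {after:.4f} MWh, recovered: {share:.1%}")
```

Clipping at zero means that intervals where the facility runs below the baseline do not offset waste elsewhere. This is one possible convention; the paper may define excess differently.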
Abstract
Effective data center cooling is crucial for reliable operation; however, cooling systems often exhibit inefficiencies that result in excessive energy consumption. This paper presents a three-stage, physics-guided machine learning framework for identifying and reducing cooling energy waste in high-performance computing facilities. Using one year of 10-minute resolution operational data from the Frontier exascale supercomputer, we first train a monotonicity-constrained gradient boosting surrogate that predicts facility accessory power from coolant flow rates, temperatures, and server power. The surrogate achieves a mean absolute error of 0.026 MW and predicts power usage effectiveness within 0.01 of measured values for 98.7% of test samples. In the second stage, the surrogate serves as a physics-consistent baseline to quantify excess cooling energy, revealing approximately 85 MWh of…
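The two accuracy figures quoted in the abstract can be computed as in the following minimal sketch. It assumes that total facility power is the sum of server power and accessory power, so that PUE = (server + accessory) / server. That decomposition, the function names, and the sample values are assumptions made for illustration and are not taken from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length series."""
    if len(actual) != len(predicted):
        raise ValueError("series must have equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pue(server_mw, accessory_mw):
    """Power usage effectiveness, assuming total = server + accessory."""
    return (server_mw + accessory_mw) / server_mw


def fraction_within(actual, predicted, tolerance):
    """Share of samples whose prediction lies within +/- tolerance."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical samples in MW (not data from the paper).
server = [20.0, 22.0, 18.0, 21.0]
accessory_measured = [1.20, 1.32, 1.10, 1.25]
accessory_predicted = [1.22, 1.30, 1.13, 1.50]

mae_mw = mean_absolute_error(accessory_measured, accessory_predicted)
pue_measured = [pue(s, a) for s, a in zip(server, accessory_measured)]
pue_predicted = [pue(s, a) for s, a in zip(server, accessory_predicted)]
share_ok = fraction_within(pue_measured, pue_predicted, tolerance=0.01)
print(f"MAE: {mae_mw:.3f} MW, PUE within 0.01: {share_ok:.0%}")
```

Under this decomposition, an accessory power error of e MW shifts PUE by e divided by the server power, so a tight PUE tolerance is easier to meet at high server load.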
Taxonomy
Topics: Heat Transfer and Optimization · Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques
