Hierarchical Multi-Agent Framework for Carbon-Efficient Liquid-Cooled Data Center Clusters
Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Vineet Gundecha, Ricardo Luna Gutierrez, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Desik Rengarajan, Cullen Bash

TL;DR
This paper presents Green-DCC, a hierarchical RL-based framework that optimizes workload distribution and liquid cooling in data center clusters to reduce carbon emissions and enhance sustainability.
Contribution
It introduces a novel hierarchical RL controller that dynamically manages workload and cooling, considering weather, carbon intensity, and resource constraints, with a benchmark simulation framework.
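To make the top level of such a hierarchy concrete, the sketch below shifts a fraction of workload from the data center with the highest grid carbon intensity to the one with the lowest, subject to spare capacity. It is only an illustration under stated assumptions: a greedy rule stands in for the learned RL policy, and every name and number (`DataCenter`, `top_level_rebalance`, `movable_fraction`, `kwh_per_unit`, the intensity values) is hypothetical rather than taken from Green-DCC.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCenter:
    """Hypothetical data center state; not the Green-DCC data model."""
    name: str
    carbon_intensity: float  # grid carbon intensity, gCO2 per kWh (assumed units)
    capacity: float          # maximum load, in arbitrary workload units
    load: float              # current load, in the same units


def top_level_rebalance(dcs: List[DataCenter], movable_fraction: float = 0.2) -> float:
    """Greedy stand-in for a top-level policy.

    Moves up to `movable_fraction` of the load at the most carbon-intensive
    site to the least carbon-intensive site, limited by spare capacity there.
    Returns the amount of load moved.
    """
    src = max(dcs, key=lambda d: d.carbon_intensity)
    dst = min(dcs, key=lambda d: d.carbon_intensity)
    if src is dst:
        return 0.0
    spare = max(dst.capacity - dst.load, 0.0)
    amount = min(src.load * movable_fraction, spare)
    src.load -= amount
    dst.load += amount
    return amount


def emissions(dcs: List[DataCenter], kwh_per_unit: float = 1.0) -> float:
    """Total emissions in gCO2, assuming energy scales linearly with load."""
    return sum(d.load * kwh_per_unit * d.carbon_intensity for d in dcs)


if __name__ == "__main__":
    cluster = [
        DataCenter("A", carbon_intensity=500.0, capacity=100.0, load=80.0),
        DataCenter("B", carbon_intensity=100.0, capacity=100.0, load=50.0),
    ]
    before = emissions(cluster)
    moved = top_level_rebalance(cluster)
    after = emissions(cluster)
    print(f"moved={moved:.1f} units, emissions {before:.0f} -> {after:.0f} gCO2")
```

A learned controller would replace the greedy rule with a policy that also accounts for weather, cooling state, and workload deadlines; the linear energy model here ignores cooling entirely and is only meant to show the shape of the decision.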
Findings
Green-DCC reduces carbon emissions effectively.
Hierarchical RL outperforms other approaches in sustainability metrics.
Framework enables synchronized optimization across multiple data centers.
Abstract
Reducing the environmental impact of cloud computing requires distributing workloads efficiently across geographically dispersed Data Center Clusters (DCCs) while simultaneously optimizing liquid and air (HVAC) cooling and time-shifting workloads within individual data centers (DCs). This paper introduces Green-DCC, a Reinforcement Learning (RL) based hierarchical controller that dynamically optimizes both workload and liquid cooling in a DCC. By incorporating factors such as weather, carbon intensity, and resource availability, Green-DCC addresses realistic constraints and interdependencies. We demonstrate how the system optimizes multiple data centers synchronously, which brings digital twins into scope, and compare the performance of various RL approaches on carbon emissions and sustainability metrics, while also offering a framework and benchmark simulation for broader ML…
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed and Parallel Computing Systems · Simulation Techniques and Applications
