AI Greenferencing: Routing AI Inferencing to Green Modular Data Centers with Heron
Tella Rajashekhar Reddy, Palak, Rohan Gandhi, Anjaly Parayil, Chaojie Zhang, Mike Shepperd, Liangcheng Yu, Jayashree Mohan, Srinivasan Iyengar, Shivkumar Kalyanaraman, Debopam Bhattacherjee

TL;DR
This paper introduces Heron, a cross-site software router that sends AI inferencing workloads to modular data centers co-located with wind farms and routes around power drops, improving the aggregate goodput of AI compute by up to 80%.
Contribution
It presents Heron, a cross-site routing system that exploits the complementarity of power generation across wind farms, together with a deployment right-sizing strategy for compute clusters placed at the power source.
Findings
Heron improves AI compute goodput by up to 80%.
Routing AI workloads to wind farms reduces energy costs and increases green energy utilization.
The deployment right-sizing strategy makes it economically viable to deploy more than 6 million high-end GPUs that consume green power at its source.
Abstract
AI power demand is growing at an unprecedented rate thanks to the high power density of AI compute and the emerging inferencing workload. On the supply side, abundant wind power is waiting for grid access in interconnection queues. In this light, this paper argues for bringing AI workloads to modular compute clusters co-located with wind farms. Our deployment right-sizing strategy makes it economically viable to deploy more than 6 million high-end GPUs today that could consume cheap, green power at its source. We built Heron, a cross-site software router that could efficiently leverage the complementarity of power generation across wind farms by routing AI inferencing workload around power drops. Using one week of coding and conversation production traces from Azure and (real) variable wind power traces, we show how Heron improves aggregate goodput of AI compute by up to 80% compared to the…
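To make the idea of "routing around power drops" concrete, here is a minimal sketch. It is not Heron's algorithm, which the abstract does not describe. It only illustrates a simple capacity-proportional split: each site's serving capacity is assumed to scale linearly with its currently available power, and demand is divided in proportion to that capacity. The function name `route_requests`, the parameter `kw_per_rps`, and the site names are hypothetical.

```python
from typing import Dict


def route_requests(
    demand_rps: float,
    site_power_kw: Dict[str, float],
    kw_per_rps: float,
) -> Dict[str, float]:
    """Split request demand across sites in proportion to power-limited capacity.

    Assumption (not from the paper): a site can serve power / kw_per_rps
    requests per second, i.e. capacity is linear in available power.
    """
    # Capacity of each site in requests per second, given current wind power.
    capacity = {site: power / kw_per_rps for site, power in site_power_kw.items()}
    total_capacity = sum(capacity.values())

    # No power anywhere: nothing can be served.
    if total_capacity == 0:
        return {site: 0.0 for site in site_power_kw}

    # Serve at most what the fleet can handle; excess demand is dropped here.
    served = min(demand_rps, total_capacity)
    return {site: served * cap / total_capacity for site, cap in capacity.items()}


if __name__ == "__main__":
    # Both sites have power: load is shared 3:1.
    print(route_requests(100.0, {"farm_a": 600.0, "farm_b": 200.0}, kw_per_rps=2.0))
    # farm_a loses wind: all load shifts to farm_b.
    print(route_requests(100.0, {"farm_a": 0.0, "farm_b": 200.0}, kw_per_rps=2.0))
```

When one farm's power drops to zero, its share of the load moves to the remaining site, which is the complementarity effect the abstract refers to. A real router would also have to account for latency, model placement, and request state, none of which this sketch models.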
Taxonomy
Topics: Big Data and Business Intelligence · IoT and Edge/Fog Computing · Digital Transformation in Industry
