Aging-aware CPU Core Management for Embodied Carbon Amortization in Cloud LLM Inference

Tharindu B. Hewage; Shashikant Ilager; Maria Rodriguez Read; Rajkumar Buyya

arXiv:2501.15829·cs.DC·January 28, 2025

TL;DR

This paper introduces an aging-aware CPU core management method for cloud LLM inference that extends CPU lifespan and reduces embodied carbon emissions by 37.67%, with less than 10% impact on inference service quality.

Contribution

It presents a novel technique leveraging CPU underutilization patterns to delay aging effects, enabling longer CPU use and lower embodied carbon in cloud LLM inference clusters.

Findings

1. 37.67% reduction in embodied carbon emissions
2. 77% decrease in CPU underutilization
3. Less than 10% impact on inference service quality

Abstract

Broad adoption of Large Language Models (LLMs) demands rapid expansion of cloud LLM inference clusters, leading to an accumulation of embodied carbon (the emissions from manufacturing and supplying IT assets) that is mostly concentrated in inference server CPUs. This paper examines the challenges of sustainable growth of cloud LLM inference, emphasizing extended amortization of CPU embodied carbon over an increased lifespan. Given the reliability risks of silicon aging, we propose an aging-aware CPU core management technique to delay CPU aging effects, allowing the cluster operator to safely increase CPU life. Our technique exploits CPU underutilization patterns that we uncover in cloud LLM inference by halting aging in unused cores and evening out aging in active cores via selective deep idling and aging-aware inference task allocation. Through extensive simulations using real-world Azure…
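
The two ideas named in the abstract, deep idling of unused cores and aging-aware task allocation, can be illustrated with a toy model. The sketch below is not the paper's algorithm or code. The `Core` class, the `age` counter (here simply accumulated busy time), and the least-aged-first rule are all assumptions made for illustration, as is the premise that a deep-idle core accrues no aging.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical core model; not taken from the paper or its repository."""
    core_id: int
    age: float = 0.0        # assumed aging proxy: accumulated busy time
    busy: bool = False
    deep_idle: bool = True  # assumption: a deep-idle core does not age


def assign_task(cores, duration):
    """Place a task on the least-aged free core so wear evens out over time."""
    free = [c for c in cores if not c.busy]
    if not free:
        return None  # no capacity; a real scheduler would queue the task
    core = min(free, key=lambda c: (c.age, c.core_id))
    core.busy = True
    core.deep_idle = False   # wake the chosen core only
    core.age += duration     # simplification: charge the full duration up front
    return core


def release(core):
    """Return a finished core to deep idle so it stops accruing age."""
    core.busy = False
    core.deep_idle = True


cores = [Core(i) for i in range(4)]
for duration in [5.0, 3.0, 2.0, 4.0, 1.0, 1.0]:
    chosen = assign_task(cores, duration)
    release(chosen)

ages = [c.age for c in cores]
print(ages)
```

In this toy run the work is spread across all four cores instead of piling onto one, and every core that is not running a task stays in deep idle. A real implementation would need a physical aging model and actual processor idle-state control, neither of which this sketch attempts.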

Code & Models

Repositories

tharindu-b-hewage/splitwise-sim-cpu-carbon (official)

Taxonomy

Topics: Cloud Computing and Resource Management