From Servers to Sites: Compositional Power Trace Generation of LLM Inference for Infrastructure Planning

Grant Wilkins; Fiodar Kazhamiaka; Ram Rajagopal

arXiv:2603.18383 · cs.DC · March 20, 2026

TL;DR

This paper presents a compositional framework for generating detailed power traces of LLM inference workloads in datacenters, enabling more accurate infrastructure planning and analysis.

Contribution

The paper introduces a trace-generation method that models LLM inference power consumption through two components: workload-driven transitions among operating states and configuration-specific power distributions within those states. The method is validated across multiple settings.
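To make the two-component idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a three-state Markov chain (idle, prefill, decode) for the workload transitions and a Gaussian power distribution per state. The transition probabilities, wattages, and function names are all hypothetical, and GPUs are sampled independently, which ignores the cross-device synchronization the paper says matters for facility demand.

```python
import random

# Hypothetical operating states, named after those in the abstract.
STATES = ("idle", "prefill", "decode")

# Assumed per-step transition probabilities (illustrative values only).
TRANSITIONS = {
    "idle":    {"idle": 0.90, "prefill": 0.10, "decode": 0.00},
    "prefill": {"idle": 0.00, "prefill": 0.60, "decode": 0.40},
    "decode":  {"idle": 0.05, "prefill": 0.05, "decode": 0.90},
}

# Assumed (mean, std) GPU power in watts within each state (illustrative).
POWER_W = {
    "idle": (80.0, 5.0),
    "prefill": (650.0, 30.0),
    "decode": (400.0, 25.0),
}


def next_state(state, rng):
    """Sample the next operating state from the current state's row."""
    row = TRANSITIONS[state]
    return rng.choices(STATES, weights=[row[s] for s in STATES])[0]


def gpu_trace(n_steps, rng, start="idle"):
    """Generate one GPU's power trace: sample power in-state, then transition."""
    state, trace = start, []
    for _ in range(n_steps):
        mean, std = POWER_W[state]
        trace.append(max(0.0, rng.gauss(mean, std)))  # clamp at 0 W
        state = next_state(state, rng)
    return trace


def site_trace(n_gpus, n_steps, seed=0):
    """Sum independent per-GPU traces into a site-level load profile."""
    rng = random.Random(seed)
    traces = [gpu_trace(n_steps, rng) for _ in range(n_gpus)]
    return [sum(step) for step in zip(*traces)]
```

Calling site_trace(4, 50, seed=1) returns 50 site-level samples for four GPUs. In a learned model, the transition table and per-state distributions would be fitted from measured traces for each configuration rather than written by hand.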

Findings

01. Median absolute energy error below 5% for most configurations
02. Preserves temporal autocorrelation in generated traces
03. Supports detailed infrastructure analysis beyond static assumptions
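The first two findings refer to an energy error and to temporal autocorrelation. This page does not give the paper's exact definitions, so the sketch below uses common ones as an assumption: relative error in total energy between a measured and a generated trace, the median of that error over several trace pairs, and the sample autocorrelation at a given lag. All function names are hypothetical.

```python
from statistics import mean, median


def energy_wh(power_w, dt_s):
    """Total energy in watt-hours for evenly spaced power samples."""
    return sum(power_w) * dt_s / 3600.0


def abs_energy_error_pct(measured_w, generated_w, dt_s=1.0):
    """Absolute relative error in total energy, as a percentage."""
    e_meas = energy_wh(measured_w, dt_s)
    e_gen = energy_wh(generated_w, dt_s)
    return abs(e_gen - e_meas) / e_meas * 100.0


def median_abs_energy_error_pct(pairs, dt_s=1.0):
    """Median of the per-pair energy error over (measured, generated) pairs."""
    return median(abs_energy_error_pct(m, g, dt_s) for m, g in pairs)


def autocorr(x, lag):
    """Sample autocorrelation of x at the given non-negative lag."""
    mu = mean(x)
    denom = sum((v - mu) ** 2 for v in x)
    num = sum((x[i] - mu) * (x[i + lag] - mu) for i in range(len(x) - lag))
    return num / denom
```

Comparing autocorr(measured, k) with autocorr(generated, k) over a range of lags k is one simple way to check whether a generated trace keeps the temporal structure of the measured one.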

Abstract

Datacenter operators and electrical utilities rely on power traces at different spatiotemporal scales. Operators use fine-grained traces for provisioning, facility management, and scheduling, while utilities use site-level load profiles for capacity and interconnection planning. Existing datacenter power models do not capture LLM inference workloads, in which GPUs shift rapidly among compute-intensive prefill, lower-power decode, and idle states, and facility demand depends on how these states evolve and synchronize across many devices. We show that LLM inference power can be represented compositionally through two components: workload-driven transitions among operating states and configuration-specific power distributions within those states. Building on this observation, we develop a trace-generation framework that learns from measured traces and synthesizes power profiles for new…

Taxonomy

Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · Parallel Computing and Optimization Techniques