The Energy Cost of Execution-Idle in GPU Clusters

Yiran Lei; Jared Fernandez; Vasilis Kypriotis; Dimitrios Skarlatos; Emma Strubell; Justine Sherry; Daniel Vosler

arXiv:2604.04745 · cs.DC · April 7, 2026


TL;DR

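This paper studies execution-idle, a state in which a GPU stays at high power even though its visible activity is near zero. It quantifies how much time and energy this state consumes in a large academic AI cluster and builds two prototypes that reduce its cost or the exposure to it, both with performance trade-offs.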

Contribution

The paper characterizes execution-idle in a real deployment using per-second telemetry, quantifies its share of in-execution time and energy across diverse workloads and multiple GPU generations, and introduces two prototypes that target it.

Findings

01

Execution-idle accounts for 19.7% of in-execution time.

02

Execution-idle accounts for 10.7% of energy.

03

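Two prototypes address execution-idle: one automatically downscales the GPU during execution-idle, and the other uses load imbalance to reduce exposure to it. Both come with performance trade-offs.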

Abstract

GPUs are becoming a major contributor to data center power, yet unlike CPUs, they can remain at high power even when visible activity is near zero. We call this state execution-idle. Using per-second telemetry from a large academic AI cluster, we characterize execution-idle as a recurring low-activity yet high-power state in real deployments. Across diverse workloads and multiple GPU generations, it accounts for 19.7% of in-execution time and 10.7% of energy. This suggests a need to both reduce the cost of execution-idle and reduce exposure to it. We therefore build two prototypes: one uses automatic downscaling during execution-idle, and the other uses load imbalance to reduce exposure, both with performance trade-offs. These findings suggest that future energy-efficient GPU systems should treat execution-idle as a first-class operating state.
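To make the two headline percentages concrete, the following is a minimal sketch of how a time share and an energy share could be computed from per-second telemetry. It is not the authors' method: the `Sample` type, the function name `execution_idle_share`, and both thresholds (`util_threshold_pct`, `power_threshold_w`) are assumptions introduced here for illustration, and the abstract does not state how the paper defines "near zero" activity or "high" power.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Sample:
    """One per-second telemetry reading from a GPU with a job running on it."""
    util_pct: float  # visible activity in percent (hypothetical field)
    power_w: float   # power draw in watts (hypothetical field)


def execution_idle_share(
    samples: Iterable[Sample],
    util_threshold_pct: float = 5.0,   # assumed cutoff for "near zero" activity
    power_threshold_w: float = 100.0,  # assumed cutoff for "high" power
) -> Tuple[float, float]:
    """Return (time share, energy share) of samples flagged as execution-idle.

    A sample is flagged when activity is at or below the utilization cutoff
    while power is at or above the power cutoff.
    """
    total_seconds = 0
    idle_seconds = 0
    total_joules = 0.0
    idle_joules = 0.0
    for s in samples:
        total_seconds += 1
        # Each sample covers one second, so watts map directly to joules.
        total_joules += s.power_w
        if s.util_pct <= util_threshold_pct and s.power_w >= power_threshold_w:
            idle_seconds += 1
            idle_joules += s.power_w
    if total_seconds == 0 or total_joules == 0.0:
        return 0.0, 0.0
    return idle_seconds / total_seconds, idle_joules / total_joules


if __name__ == "__main__":
    # Made-up readings, not data from the paper.
    trace = [
        Sample(util_pct=90.0, power_w=300.0),
        Sample(util_pct=0.0, power_w=150.0),   # low activity, high power
        Sample(util_pct=95.0, power_w=300.0),
        Sample(util_pct=1.0, power_w=150.0),   # low activity, high power
        Sample(util_pct=0.0, power_w=40.0),    # low activity, low power
    ]
    time_share, energy_share = execution_idle_share(trace)
    print(f"time share: {time_share:.3f}, energy share: {energy_share:.3f}")
```

Under these assumptions the energy share is smaller than the time share whenever flagged seconds draw less power than busy seconds, which is the same direction as the gap between the reported 19.7% of time and 10.7% of energy.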
