Part-time Power Measurements: nvidia-smi's Lack of Attention
Zeyu Yang, Karel Adamek, Wesley Armour

TL;DR
This study critically examines nvidia-smi's GPU power readings, revealing significant inaccuracies and sampling issues across various GPU architectures, and proposes practices to improve energy measurement accuracy.
Contribution
It provides a detailed analysis of nvidia-smi's power measurement mechanisms, identifies key inaccuracies, and offers mitigation strategies to enhance energy consumption estimation.
Findings
On some GPUs, nvidia-smi samples power during only 25% of the runtime.
Power and energy figures from nvidia-smi differ significantly from external power-meter measurements.
The proposed practices reduce energy measurement error by up to 65%.
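To make the first finding concrete, the toy simulation below sketches how a sensor that observes only part of each update period can misestimate the energy of a periodic workload. It is an illustration under stated assumptions, not the authors' benchmark suite: the 100 ms update period, the 25 ms sampling window (chosen to match the 25% figure), the square-wave workload, and all function names are hypothetical.

```python
# Toy model: a sensor that looks at only the first 25% of each update period.
# All numbers and names are illustrative assumptions, not measured values.

def square_wave_w(t_ms, period_ms=100, busy_ms=25, high_w=300.0, low_w=100.0):
    """Hypothetical workload: high_w for busy_ms of every period_ms, else low_w."""
    return high_w if (t_ms % period_ms) < busy_ms else low_w

def true_energy_j(power_w, duration_ms):
    """Integrate power over every millisecond of the run (W * ms -> J)."""
    return sum(power_w(t) for t in range(duration_ms)) / 1000.0

def sampled_energy_j(power_w, duration_ms, update_ms=100, window_ms=25):
    """Average power over the first window_ms of each update period and
    treat that average as the reading for the whole period."""
    energy = 0.0
    for start in range(0, duration_ms, update_ms):
        window = range(start, start + window_ms)
        reading_w = sum(power_w(t) for t in window) / window_ms
        energy += reading_w * update_ms / 1000.0
    return energy

duration_ms = 10_000  # 10 s run

# Busy phase aligned with the sampling window: the sensor sees only high power.
aligned = sampled_energy_j(square_wave_w, duration_ms)

# Busy phase shifted by 25 ms: the sensor sees only low power.
shifted = sampled_energy_j(lambda t: square_wave_w(t + 25), duration_ms)

print(true_energy_j(square_wave_w, duration_ms))  # 1500.0
print(aligned)                                    # 3000.0
print(shifted)                                    # 1000.0
```

In this toy case the same workload is reported as using twice its true energy or two thirds of it, depending only on how its busy phase lines up with the sampling window. That is why part-time sampling can produce both overestimates and underestimates.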
Abstract
The GPU has emerged as the go-to accelerator for high-throughput and parallel workloads, spanning scientific simulations to AI, thanks to its performance and power efficiency. Given that 6 of the top 10 fastest supercomputers in the world use NVIDIA GPUs and many AI companies each employ tens of thousands of NVIDIA GPUs, an accurate understanding of GPU power consumption is essential for further improving its efficiency. Despite limited documentation and a poor understanding of its mechanisms, the built-in power sensor of NVIDIA GPUs, which provides easily accessible power readings via the nvidia-smi interface, is widely used in energy-efficient computing research on GPUs. Our study seeks to elucidate the internal mechanisms behind the power readings provided by nvidia-smi and to assess the accuracy of the power and energy consumption data. We have developed a suite of…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Green IT and Sustainability · Cloud Computing and Resource Management
