AI Application Benchmarking: Power-Aware Performance Analysis for Vision and Language Models
Martin Mayr, Sebastian Wind, Lukas Schröder, Georg Hager, Harald Köstler, Gerhard Wellein

TL;DR
This paper presents a benchmarking framework for AI workloads focusing on power-aware performance analysis across vision and language models, revealing diverse energy efficiency trade-offs on different GPU architectures.
Contribution
It introduces a new benchmarking framework for AI workloads that evaluates performance and energy efficiency under power capping scenarios, highlighting architecture-specific trade-offs.
Findings
No single power cap is optimal across all applications and GPUs.
Different GPU architectures exhibit distinct performance-energy trade-offs.
The framework will be publicly released for broader use.
Abstract
Artificial Intelligence (AI) workloads drive a rapid expansion of high-performance computing (HPC) infrastructures and increase their power and energy demands towards a critical level. AI benchmarks representing state-of-the-art workloads and their understanding in the context of performance-energy trade-offs are critical to deploy efficient infrastructures and can guide energy efficiency measures, such as power capping. We introduce a benchmarking framework with popular deep learning applications from computer vision (image classification and generation) and large language models (continued pre-training and inference) implementing modern methods. Our performance analysis focuses on throughput rather than time to "completion", which is the standard metric in HPC. We analyse performance and energy efficiency under various power capping scenarios on NVIDIA H100, NVIDIA H200, and AMD…
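The relation between throughput, power draw, and energy efficiency under a power cap can be illustrated with a minimal Python sketch. It is not the paper's framework: the `CapMeasurement` type, the helper names, and all numbers are hypothetical placeholders chosen for illustration. It assumes energy efficiency is expressed as energy per work item, that is, average power divided by throughput.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CapMeasurement:
    """One hypothetical benchmark run at a fixed GPU power cap."""
    power_cap_w: float   # configured power cap in watts
    throughput: float    # work items (e.g. images or tokens) per second
    avg_power_w: float   # measured average power draw in watts


def energy_per_item(m: CapMeasurement) -> float:
    """Joules per work item: average power (J/s) divided by throughput (items/s)."""
    return m.avg_power_w / m.throughput


def most_efficient_cap(runs: list[CapMeasurement]) -> CapMeasurement:
    """Return the run with the lowest energy per work item."""
    return min(runs, key=energy_per_item)


# Placeholder numbers for illustration only; these are NOT results from the paper.
runs = [
    CapMeasurement(power_cap_w=700.0, throughput=1000.0, avg_power_w=650.0),
    CapMeasurement(power_cap_w=500.0, throughput=900.0, avg_power_w=480.0),
    CapMeasurement(power_cap_w=300.0, throughput=500.0, avg_power_w=290.0),
]

best = most_efficient_cap(runs)
print(f"Most energy-efficient cap: {best.power_cap_w:.0f} W "
      f"({energy_per_item(best):.3f} J/item)")
```

Running the same selection separately for each application and GPU would be one way to check whether the energy-optimal cap differs between them, which is the kind of question the framework is meant to answer.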
Taxonomy
Topics: Big Data and Digital Economy · Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management
