Racing to Idle: Energy Efficiency of Matrix Multiplication on Heterogeneous CPU and GPU Architectures
Mufakir Qamar Ansari (1), Mudabir Qamar Ansari (2) ((1) Department of Electrical Engineering, Computer Science, The University of Toledo, Toledo, OH, USA, (2) Department of School of Accounting, Information Systems, Lamar University, Beaumont, TX, USA)

TL;DR
This study empirically compares the performance and energy efficiency of matrix multiplication across CPU, discrete GPU, and integrated GPU on a laptop, highlighting the discrete GPU's superior energy efficiency and performance.
Contribution
It provides a direct, empirical measurement of energy and performance trade-offs for matrix multiplication on heterogeneous consumer hardware using accessible tools.
Findings
The discrete GPU achieves a 93.5x speedup over the CPU.
The discrete GPU consumes only 2% of the CPU's energy for the same workload.
The discrete GPU offers a 50-fold improvement in energy efficiency over the CPU.
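The three findings are linked by simple arithmetic, which the sketch below works through. Energy-to-solution is average power times runtime, so the energy ratio equals the power ratio divided by the speedup. The function names are illustrative, and the implied power ratio is derived here from the rounded figures above; it is not a number reported in this summary.

```python
def efficiency_gain(energy_fraction: float) -> float:
    """Energy-efficiency improvement implied by using a given fraction of the baseline energy."""
    return 1.0 / energy_fraction


def implied_power_ratio(energy_fraction: float, speedup: float) -> float:
    """Average-power ratio (accelerator / baseline) implied by E = P * t.

    E_gpu / E_cpu = (P_gpu * t_gpu) / (P_cpu * t_cpu) = (P_gpu / P_cpu) / speedup,
    so P_gpu / P_cpu = energy_fraction * speedup.
    """
    return energy_fraction * speedup


# Rounded figures taken from the findings above.
SPEEDUP = 93.5          # discrete GPU vs. CPU runtime
ENERGY_FRACTION = 0.02  # discrete GPU energy as a fraction of CPU energy

gain = efficiency_gain(ENERGY_FRACTION)
power_ratio = implied_power_ratio(ENERGY_FRACTION, SPEEDUP)

print(f"efficiency gain: about {gain:.0f}x")
print(f"implied average power ratio: about {power_ratio:.2f}x")
```

Under these rounded inputs the discrete GPU would draw somewhat more average power than the CPU while running, yet finish so much sooner that total energy falls sharply. That is the "race to idle" idea in the title.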
Abstract
The paradigm shift towards multi-core and heterogeneous computing, driven by the fundamental power and thermal limits of single-core processors, has established energy efficiency as a first-class design constraint in high-performance computing (HPC). Heterogeneous systems, integrating traditional multi-core CPUs with specialized accelerators like discrete (dGPU) and integrated (iGPU) graphics processing units, offer a compelling path to navigating the trade-offs between performance and power. However, quantifying these trade-offs on widely accessible hardware remains a critical area of study. This paper presents a direct, empirical measurement of the performance and energy-to-solution of a canonical HPC workload -- a 4096x4096 matrix-matrix multiplication -- on three distinct compute architectures within a single consumer-grade laptop: a multi-core AMD Ryzen 7 5800H CPU, a discrete…
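As a rough illustration of the energy-to-solution metric named in the abstract, the sketch below integrates sampled power over time with the trapezoidal rule and relates it to the conventional 2n^3 operation count for a dense n x n matrix product. The function names and the power samples are hypothetical placeholders, not the authors' tooling or data.

```python
def matmul_flops(n: int) -> int:
    """Conventional operation count for a dense n x n by n x n product: 2 * n^3."""
    return 2 * n ** 3


def energy_to_solution(timestamps_s, power_w) -> float:
    """Trapezoidal integral of sampled power (watts) over time (seconds), in joules."""
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need at least two matching (time, power) samples")
    total_j = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total_j += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total_j


# Hypothetical power trace for one run (made-up values, for illustration only).
times = [0.0, 0.5, 1.0, 1.5, 2.0]       # seconds
watts = [10.0, 40.0, 42.0, 41.0, 12.0]  # watts

flops = matmul_flops(4096)                 # the 4096x4096 workload from the abstract
energy_j = energy_to_solution(times, watts)
gflop_per_joule = flops / 1e9 / energy_j   # one common efficiency figure

print(f"{flops / 1e9:.1f} GFLOP, {energy_j:.1f} J, {gflop_per_joule:.2f} GFLOP/J")
```

Integrating the trace rather than multiplying a single power reading by the runtime matters when power ramps up and down during the run, as it typically does on a laptop.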
