Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP
Ghazal Rahimi, Victor Lopez, Marc Clascà, Joan Vinyals Ylla Català, Jesus Labarta, Marta Garcia-Gasulla

TL;DR
This paper extends the POP efficiency metrics to heterogeneous CPU-accelerator HPC systems and implements them in TALP, giving separate insight into host and device performance to guide optimization.
Contribution
It introduces a new hierarchy of efficiency metrics tailored for heterogeneous architectures, implemented within the TALP monitoring library.
Findings
Metrics reveal inefficiencies in offloading and load balancing.
Validation on benchmarks and HPC applications demonstrates actionable insights.
Extended metrics improve understanding of host-device interactions.
Abstract
The increasing adoption of heterogeneous platforms that combine CPUs with accelerators such as GPUs in high-performance computing (HPC) introduces new challenges for performance analysis and optimization. Traditional efficiency metrics, such as those proposed by the Performance Optimization and Productivity (POP) Center of Excellence, were designed primarily for homogeneous CPU-based systems and therefore do not capture the complex interactions between host and device resources. In this work, we extend the POP efficiency framework to heterogeneous architectures by introducing a new hierarchy of metrics that separately evaluate host and device efficiency. On the host side, we quantify the effectiveness of hybrid execution and offloading operations. On the device side, we propose a multiplicative hierarchy analogous to the host hierarchy and define its Parallel Efficiency branch. Beyond…
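To illustrate what "multiplicative hierarchy" means, the sketch below computes the classic CPU-side POP Parallel Efficiency as the product of Load Balance and Communication Efficiency. It covers only the traditional homogeneous model that the abstract says is being extended, not the host or device metrics introduced in the paper, whose definitions are not given here. The function name, argument names, and sample timings are hypothetical and chosen purely for illustration; this is not TALP's API.

```python
from statistics import mean


def pop_parallel_efficiency(useful_times, elapsed):
    """Classic POP metrics for a homogeneous run (illustrative sketch).

    useful_times: per-process time spent in useful computation (seconds).
    elapsed: total wall-clock time of the measured region (seconds).
    Returns (load_balance, communication_efficiency, parallel_efficiency).
    """
    if not useful_times or elapsed <= 0 or max(useful_times) <= 0:
        raise ValueError("need positive elapsed time and useful times")

    avg_useful = mean(useful_times)
    max_useful = max(useful_times)

    # Load Balance: how evenly useful work is spread across processes.
    load_balance = avg_useful / max_useful
    # Communication Efficiency: share of the run the busiest process
    # spends computing rather than communicating or waiting.
    communication_efficiency = max_useful / elapsed
    # The parent metric is the product of its children, which
    # simplifies to avg_useful / elapsed.
    parallel_efficiency = load_balance * communication_efficiency
    return load_balance, communication_efficiency, parallel_efficiency


# Hypothetical timings for four processes in a 10-second region.
lb, ce, pe = pop_parallel_efficiency([8.0, 6.0, 4.0, 6.0], elapsed=10.0)
print(f"{lb:.2f} {ce:.2f} {pe:.2f}")  # prints "0.75 0.80 0.60"
```

Because each parent is the product of its children, a low Parallel Efficiency can be traced to the specific child factor responsible, which is the property the abstract carries over to the device side.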
