Bringing Auto-tuning to HIP: Analysis of Tuning Impact and Difficulty on AMD and Nvidia GPUs

Milo Lurati; Stijn Heldens; Alessio Sclocco; Ben van Werkhoven

arXiv:2407.11488·cs.DC·July 17, 2024

TL;DR

This paper extends Kernel Tuner with support for AMD's HIP, analyzes tuning impact and tuning difficulty on AMD and Nvidia GPUs, and shows that high performance on AMD requires tuning specifically for AMD.

Contribution

The paper extends Kernel Tuner to support AMD's HIP and provides the first detailed analysis of auto-tuning impact and difficulty on AMD GPUs compared to Nvidia GPUs.

Findings

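01 Auto-tuning yields about a 10x performance improvement on AMD GPUs for the benchmark kernels studied.

02 Tuning impact is significantly higher on AMD than on Nvidia (10x vs 2x).

03 Applications tuned for Nvidia GPUs do not perform optimally on AMD GPUs, so they need to be tuned specifically for AMD.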

Abstract

Many studies have focused on developing and improving auto-tuning algorithms for Nvidia Graphics Processing Units (GPUs), but the effectiveness and efficiency of these approaches on AMD devices have hardly been studied. This paper aims to address this gap by introducing an auto-tuner for AMD's HIP. We do so by extending Kernel Tuner, an open-source Python library for auto-tuning GPU programs. We analyze the performance impact and tuning difficulty for four highly-tunable benchmark kernels on four different GPUs: two from Nvidia and two from AMD. Our results demonstrate that auto-tuning has a significantly higher impact on performance on AMD compared to Nvidia (10x vs 2x). Additionally, we show that applications tuned for Nvidia do not perform optimally on AMD, underscoring the importance of auto-tuning specifically for AMD to achieve high performance on these GPUs.
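The idea of tuning impact and cross-device portability described in the abstract can be illustrated with a small sketch. It does not use Kernel Tuner or a real GPU: the search space, the two "devices", and the cost model are all hypothetical, and the impact metric (median runtime divided by best runtime) is an assumption chosen for illustration, not necessarily the metric used in the paper.

```python
import itertools
import statistics

# Hypothetical search space; parameter names and values are illustrative only.
TUNE_PARAMS = {
    "block_size_x": [32, 64, 128, 256],
    "tile_size": [1, 2, 4, 8],
}


def synthetic_runtime(config, preferred_block, preferred_tile):
    """Toy cost model (not measured data): runtime grows with the relative
    distance of each parameter from a device's preferred value."""
    block_penalty = abs(config["block_size_x"] - preferred_block) / preferred_block
    tile_penalty = abs(config["tile_size"] - preferred_tile) / preferred_tile
    return 1.0 + block_penalty + tile_penalty


def brute_force_tune(tune_params, runtime_fn):
    """Evaluate every configuration and return (runtime, config) pairs,
    sorted from fastest to slowest."""
    names = list(tune_params)
    results = []
    for values in itertools.product(*(tune_params[name] for name in names)):
        config = dict(zip(names, values))
        results.append((runtime_fn(config), config))
    results.sort(key=lambda pair: pair[0])
    return results


def tuning_impact(results):
    """Assumed metric: how much faster the best configuration is than the median one."""
    times = [runtime for runtime, _ in results]
    return statistics.median(times) / min(times)


# Two hypothetical devices with different sweet spots.
def device_a(config):
    return synthetic_runtime(config, preferred_block=128, preferred_tile=4)


def device_b(config):
    return synthetic_runtime(config, preferred_block=64, preferred_tile=2)


results_a = brute_force_tune(TUNE_PARAMS, device_a)
results_b = brute_force_tune(TUNE_PARAMS, device_b)

best_time_a, best_config_a = results_a[0]
best_time_b, best_config_b = results_b[0]

# Slowdown from reusing device A's best configuration on device B
# instead of tuning for device B directly.
portability_slowdown = device_b(best_config_a) / best_time_b

print("impact on A:", tuning_impact(results_a))
print("impact on B:", tuning_impact(results_b))
print("slowdown of A's best config on B:", portability_slowdown)
```

In this toy setting, the configuration that is optimal on one device is clearly suboptimal on the other, which mirrors the abstract's point that settings tuned for Nvidia do not carry over to AMD. A real study would replace `synthetic_runtime` with measured kernel runtimes.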

Code & Models

Repositories

milolurati/autotuning_amd_vs_nvidia_gpus (Official)

Taxonomy

Topics: Parallel Computing and Optimization Techniques