Preliminary report: Initial evaluation of StdPar implementations on AMD GPUs for HPC
Wei-Chen Lin, Simon McIntosh-Smith, Tom Deakin

TL;DR
This paper evaluates the performance of StdPar (C++17 parallel algorithms) implementations on AMD GPUs, compares them with other heterogeneous programming models, and explores a host-side page-fault workaround for unified shared memory (USM) performance issues, giving initial insight into AMD GPU support for StdPar.
Contribution
It provides the first evaluation of StdPar on AMD GPUs, compares the currently available implementations against one another and against other programming models, and introduces a host-side page-fault workaround for USM performance issues.
Findings
StdPar performance varies across AMD GPU implementations.
The host-side page-fault workaround improves USM performance.
StdPar can be a viable model on AMD GPUs with workarounds.
Abstract
Until recently, AMD platforms have not supported offloading C++17 PSTL (StdPar) programs to the GPU. Our previous work highlights how StdPar is able to achieve good performance across NVIDIA and Intel GPU platforms. In that work, we acknowledged AMD's past efforts, such as HCC, which unfortunately is deprecated and does not support newer hardware platforms. Recent developments by AMD, Codeplay, and AdaptiveCpp (previously known as hipSYCL or OpenSYCL) have enabled multiple paths for StdPar programs to run on AMD GPUs. This informal report discusses our experiences and evaluation of currently available StdPar implementations for AMD GPUs. We conduct benchmarks using our suite of HPC mini-apps with ports in many heterogeneous programming models, including StdPar. We then compare the performance of StdPar, using all available StdPar compilers, to contemporary heterogeneous programming models…
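For readers unfamiliar with the model, a StdPar program is ordinary ISO C++17 that calls standard algorithms with an execution policy. The sketch below is illustrative only and is not taken from the paper's mini-app suite: the function names `triad` and `dot` are hypothetical, chosen to resemble the memory-bound kernels commonly used in HPC benchmarks. Whether the algorithms run on a GPU or on host threads depends entirely on the compiler and the options used to build it.

```cpp
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

// Hypothetical triad kernel: a[i] = b[i] + scalar * c[i].
// Assumes a, b and c all have the same length.
void triad(std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, double scalar) {
  // par_unseq permits parallel and vectorised execution; an offloading
  // StdPar compiler may map this call onto a GPU kernel.
  std::transform(std::execution::par_unseq, b.begin(), b.end(), c.begin(),
                 a.begin(),
                 [scalar](double bi, double ci) { return bi + scalar * ci; });
}

// Hypothetical dot-product kernel expressed as a parallel reduction.
// Assumes a and b have the same length.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
  return std::transform_reduce(std::execution::par_unseq, a.begin(), a.end(),
                               b.begin(), 0.0);
}
```

Calling `triad(a, b, c, 0.5)` on equally sized vectors overwrites `a` in place. Because the kernels operate on ordinary heap-allocated containers rather than explicitly managed device buffers, an offloading implementation has to make that memory reachable from the GPU, which is where USM behaviour and page-fault handling become relevant to performance.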
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Scientific Computing and Data Management · Distributed and Parallel Computing Systems
