TL;DR
This paper presents a performance-portable approach to plasma simulation that lets PIConGPU run efficiently on heterogeneous hardware, including the OpenPower platform, CPUs, and GPUs, by combining abstract meta-programming with the Alpaka library.
Contribution
It introduces cupla, a porting interface that wraps the abstract parallel C++11 kernel acceleration library Alpaka, so that the CUDA-based PIConGPU code can target diverse hardware architectures while keeping high performance.
Findings
Achieved performance portability of PIConGPU across CPUs and GPUs.
Demonstrated effective fine-grained tuning with the Alpaka library.
Enabled single-source kernels for multiple hardware platforms.
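The single-source idea in the findings above can be illustrated with a minimal C++11 sketch. It is not the Alpaka or cupla API: the names `SerialAcc`, `BlockedAcc`, `AxpyKernel`, and `run_axpy` are hypothetical and exist only for this illustration. The kernel body is written once, and the back-end that decides how indices are traversed is chosen as a template parameter at compile time. The block size is also a compile-time parameter, which stands in for the kind of tuning knob the summary mentions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical back-end: visits every index in order on the calling thread.
struct SerialAcc {
    template <typename Kernel>
    static void exec(std::size_t n, Kernel kernel) {
        for (std::size_t i = 0; i < n; ++i) kernel(i);
    }
};

// Hypothetical back-end: emulates a grid of fixed-size blocks on the host.
// BlockSize is a compile-time tuning parameter.
template <std::size_t BlockSize>
struct BlockedAcc {
    static_assert(BlockSize > 0, "BlockSize must be positive");

    template <typename Kernel>
    static void exec(std::size_t n, Kernel kernel) {
        const std::size_t blocks = (n + BlockSize - 1) / BlockSize;
        for (std::size_t b = 0; b < blocks; ++b) {
            for (std::size_t t = 0; t < BlockSize; ++t) {
                const std::size_t i = b * BlockSize + t;
                if (i < n) kernel(i);  // guard the partially filled last block
            }
        }
    }
};

// The kernel is written once and knows nothing about the back-end.
struct AxpyKernel {
    double a;
    const double* x;
    double* y;
    void operator()(std::size_t i) const { y[i] = a * x[i] + y[i]; }
};

// Single entry point; the back-end is selected by the template argument.
// Assumes x has at least as many elements as y.
template <typename Acc>
std::vector<double> run_axpy(double a, const std::vector<double>& x,
                             std::vector<double> y) {
    Acc::exec(y.size(), AxpyKernel{a, x.data(), y.data()});
    return y;
}
```

Calling `run_axpy<SerialAcc>(2.0, x, y)` and `run_axpy<BlockedAcc<2>>(2.0, x, y)` on the same inputs runs the same kernel source through two different traversal strategies. Because the choice is resolved at compile time, the abstraction itself adds no run-time dispatch.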
Abstract
With the appearance of the heterogeneous platform OpenPower, many-core accelerator devices have been coupled with Power host processors for the first time. Towards utilizing their full potential, it is worth investigating performance-portable algorithms that allow choosing the best-fitting hardware for each domain-specific compute task. Suiting even the high level of parallelism on modern GPGPUs, our presented approach relies heavily on abstract meta-programming techniques, which are essential to focus on fine-grained tuning rather than code porting. With this in mind, the CUDA-based open-source plasma simulation code PIConGPU is currently being abstracted to support the heterogeneous OpenPower platform using our fast porting interface cupla, which wraps the abstract parallel C++11 kernel acceleration library Alpaka. We demonstrate how PIConGPU can benefit from the tunable kernel…
