GPU Computing with Python: Performance, Energy Efficiency and Usability
Håvard H. Holm, André R. Brodtkorb, Martin L. Sætra

TL;DR
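This paper evaluates Python-based GPU computing for HPC codes, analyzing performance, energy efficiency, and usability across GPU generations, across low-end, mid-range, and high-end GPUs, and across CUDA and OpenCL. It finds that the impact of using Python is negligible for the tested applications and that CUDA and OpenCL, when tuned to an equivalent level, can in many cases reach the same performance.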
Contribution
It compares the performance and energy efficiency of CUDA and OpenCL across GPU generations and across low-end, mid-range, and high-end GPUs, and shows that using Python has a negligible impact on the tested applications.
Findings
Python's impact on performance is negligible
CUDA and OpenCL applications tuned to an equivalent level can in many cases achieve the same performance
GPU performance varies more between models than between programming frameworks
Abstract
In this work, we examine the performance, energy efficiency and usability when using Python for developing HPC codes running on the GPU. We investigate the portability of performance and energy efficiency between CUDA and OpenCL; between GPU generations; and between low-end, mid-range and high-end GPUs. Our findings show that the impact of using Python is negligible for our applications, and furthermore, CUDA and OpenCL applications tuned to an equivalent level can in many cases obtain the same computational performance. Our experiments show that performance in general varies more between different GPUs than between using CUDA and OpenCL. We also show that tuning for performance is a good way of tuning for energy efficiency, but that specific tuning is needed to obtain optimal energy efficiency.
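The last point of the abstract, that tuning for speed is a good proxy for tuning for energy efficiency but not a perfect one, can be illustrated with a minimal sketch. It relies only on the identity energy = mean power × runtime. The `Run` class, the configuration labels, and all numbers below are hypothetical and chosen for illustration; they are not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One benchmark run of a hypothetical kernel configuration."""
    label: str      # hypothetical configuration name, e.g. a block size
    cells: float    # amount of work done, e.g. number of cell updates
    seconds: float  # wall-clock runtime
    watts: float    # mean power draw during the run

    @property
    def throughput(self) -> float:
        # Work per second: the usual performance metric.
        return self.cells / self.seconds

    @property
    def joules(self) -> float:
        # Energy consumed = mean power * runtime.
        return self.watts * self.seconds

    @property
    def efficiency(self) -> float:
        # Work per joule: the energy-efficiency metric.
        return self.cells / self.joules


# Illustrative numbers only (NOT taken from the paper).
runs = [
    Run("16x16", cells=1e9, seconds=2.0, watts=150.0),
    Run("32x8", cells=1e9, seconds=1.6, watts=200.0),
    Run("32x4", cells=1e9, seconds=1.7, watts=170.0),
]

fastest = max(runs, key=lambda r: r.throughput)
greenest = max(runs, key=lambda r: r.efficiency)

for r in runs:
    print(f"{r.label}: {r.throughput:.3e} cells/s, "
          f"{r.joules:.0f} J, {r.efficiency:.3e} cells/J")
print("fastest:", fastest.label)
print("most energy efficient:", greenest.label)
```

In this made-up data set the fastest configuration draws enough extra power that a slightly slower one uses less energy for the same work, which is the kind of situation where tuning specifically for energy efficiency would pay off.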
