TL;DR
This paper introduces GPU run-time code generation (RTCG) and presents PyCUDA and PyOpenCL, two open-source toolkits that pair a dynamic, high-level scripting language with GPU performance to improve programmer productivity in heterogeneous computing environments.
Contribution
The paper proposes RTCG as a simple, effective technique, supported by PyCUDA and PyOpenCL, for building custom, application-specific GPU tools, and it illustrates the approach with several examples.
Findings
RTCG supports custom GPU tool creation
PyCUDA and PyOpenCL enable high-level scripting with GPU performance
The presented examples show improvements in both performance and productivity
Abstract
High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL, two open-source toolkits that support this technique. In introducing PyCUDA and PyOpenCL, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a…
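To make the idea of run-time code generation concrete, here is a minimal sketch in Python. It assumes a simple element-wise vector addition as the target and builds the CUDA C source as a string at run time, with the element type and an unroll factor chosen by the calling script. The names `KERNEL_TEMPLATE`, `generate_add_kernel`, and `try_compile` are hypothetical and are not taken from the paper. The compile step uses PyCUDA's `SourceModule` only if PyCUDA and a CUDA device are available. Otherwise it returns `None`, so the generation step can still be inspected on its own.

```python
from string import Template

# Hypothetical kernel template (illustrative, not from the paper).
# Placeholders are filled in by the host script at run time.
KERNEL_TEMPLATE = Template("""
__global__ void ${name}(${dtype} *dest, const ${dtype} *a,
                        const ${dtype} *b, int n)
{
    int base = (blockIdx.x * blockDim.x + threadIdx.x) * ${unroll};
${body}
}
""")


def generate_add_kernel(name="add_vectors", dtype="float", unroll=4):
    """Return CUDA C source for an element-wise add, unrolled `unroll` times."""
    lines = []
    for i in range(unroll):
        # Each thread handles `unroll` consecutive elements, bounds-checked.
        lines.append(
            f"    if (base + {i} < n) "
            f"dest[base + {i}] = a[base + {i}] + b[base + {i}];"
        )
    return KERNEL_TEMPLATE.substitute(
        name=name, dtype=dtype, unroll=unroll, body="\n".join(lines)
    )


def try_compile(source):
    """Compile `source` with PyCUDA if possible; return None otherwise.

    Assumption: PyCUDA, a CUDA toolchain, and a GPU may all be missing,
    so any failure is treated as "not available" in this sketch.
    """
    try:
        import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
        from pycuda.compiler import SourceModule
        return SourceModule(source)
    except Exception:
        return None


if __name__ == "__main__":
    source = generate_add_kernel(dtype="double", unroll=2)
    print(source)
    module = try_compile(source)
    print("compiled" if module is not None else "PyCUDA or GPU not available")
```

Because the source is an ordinary Python string, the script can generate several variants (for example, different unroll factors) and keep whichever one measures fastest on the hardware at hand.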