TL;DR
This paper introduces pocl, an open-source OpenCL implementation that achieves both platform and performance portability across diverse hardware architectures, utilizing a modular kernel compiler and LLVM IR metadata.
Contribution
It presents a novel, modular kernel compiler for OpenCL that retains data parallelism information for efficient cross-platform performance optimization.
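To illustrate why retained data-parallelism information matters, here is a minimal sketch of how one work-group of an OpenCL-style kernel could be run on a CPU as a plain loop over its work-items. This is not pocl's code: a real kernel compiler would transform compiler IR rather than run Python, and every name below (`vec_add_work_item`, `run_work_group`, `run_nd_range`) is hypothetical. The only point is that the loop iterations are independent, which is the fact a compiler can later exploit for vectorization or unrolling on a given target.

```python
from typing import Callable, List


def vec_add_work_item(gid: int, a: List[float], b: List[float], c: List[float]) -> None:
    """Hypothetical kernel body: the work done by one work-item."""
    c[gid] = a[gid] + b[gid]


def run_work_group(kernel: Callable[..., None], group_id: int,
                   local_size: int, global_size: int, *args) -> None:
    """Run one work-group as a loop over its local ids.

    No iteration reads a value written by another iteration, so the loop
    is parallel; that is the information worth preserving for later passes.
    """
    for local_id in range(local_size):
        gid = group_id * local_size + local_id
        # Guard for a partial last group. This is a simplification for the
        # sketch, so the example works with any global size.
        if gid < global_size:
            kernel(gid, *args)


def run_nd_range(kernel: Callable[..., None], global_size: int,
                 local_size: int, *args) -> None:
    """Run a 1-D index space by iterating over its work-groups."""
    num_groups = (global_size + local_size - 1) // local_size
    for group_id in range(num_groups):
        run_work_group(kernel, group_id, local_size, global_size, *args)


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
c = [0.0] * len(a)
run_nd_range(vec_add_work_item, len(a), 2, a, b, c)
print(c)  # [11.0, 22.0, 33.0, 44.0, 55.0]
```

On a SIMD target the inner loop could be mapped onto vector lanes, while on a multicore CPU the outer loop over work-groups could be spread across threads. The kernel source stays the same in both cases.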
Findings
Most benchmarked applications run faster than, or comparably to, the same applications under proprietary implementations.
Supports a wide range of architectures, including future research platforms.
Enables portable and efficient OpenCL execution across heterogeneous systems.
Abstract
OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus reducing the program porting effort. While the standard brings the obvious benefits of platform portability, the performance portability aspects are largely left to the programmer. The situation is made worse due to multiple proprietary vendor implementations with different characteristics, and, thus, required optimization strategies. In this paper, we propose an OpenCL implementation that is both portable and performance portable. At its core is a kernel compiler that can be used to exploit the data parallelism of OpenCL programs on multiple platforms with different parallel hardware styles. The kernel compiler is modularized to perform…
