CuPBoP: CUDA for Parallelized and Broad-range Processors
Ruobing Han, Jun Chen, Bhanu Garg, Jeffrey Young, Jaewoong Sim, and Hyesoon Kim

TL;DR
CuPBoP enables CUDA programs to run on non-NVIDIA hardware without manual code modifications, achieving high coverage and performance across various architectures, thus broadening CUDA's applicability in heterogeneous systems.
Contribution
CuPBoP introduces an approach for executing CUDA on non-NVIDIA devices without manual source code modifications; it surpasses existing frameworks in coverage and supports multiple ISAs with competitive performance.
Findings
Achieves 69.6% coverage on the Rodinia benchmark suite, higher than existing frameworks.
Supports multiple CPU ISAs, including x86, RISC-V, and AArch64, with high performance.
Delivers performance comparable or superior to manually optimized OpenMP/MPI and CUDA programs.
Abstract
CUDA is one of the most popular choices for GPU programming, but it can only be executed on NVIDIA GPUs. Executing CUDA on non-NVIDIA devices not only benefits the hardware community, but also allows data-parallel computation in heterogeneous systems. To make CUDA programs portable, some researchers have proposed using source-to-source translators to translate CUDA to portable programming languages that can be executed on non-NVIDIA devices. However, most CUDA translators require additional manual modifications on the translated code, which imposes a heavy workload on developers. In this paper, CuPBoP is proposed to execute CUDA on non-NVIDIA devices without relying on any portable programming languages. Compared with existing work that executes CUDA on non-NVIDIA devices, CuPBoP does not require manual modification of the CUDA source code, but it still achieves the highest coverage…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Data Storage Technologies
