Comparing Performance and Portability between CUDA and SYCL for Protein Database Search on NVIDIA, AMD, and Intel GPUs
Manuel Costanzo, Enzo Rucci, Carlos García Sánchez, Marcelo Naiouf, Manuel Prieto-Matías

TL;DR
This study compares CUDA and SYCL for protein database search across NVIDIA, AMD, and Intel GPUs, showing SYCL's superior portability and comparable performance on NVIDIA devices.
Contribution
It demonstrates that SYCL offers better portability across diverse GPU architectures while maintaining similar performance levels to CUDA on NVIDIA GPUs.
Findings
SYCL achieves remarkable portability to AMD and Intel GPUs.
Performance on NVIDIA GPUs is similar between CUDA and SYCL.
On AMD and Intel GPUs, the architectural efficiency rates achieved by SYCL were superior in 3 of the 4 cases tested.
Abstract
The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the portability and performance of the SYCL and CUDA languages for one fundamental bioinformatics application (Smith-Waterman protein database search) across different GPU architectures, considering single and multi-GPU configurations from different vendors. The experimental work showed that, while both CUDA and SYCL versions achieve similar performance on NVIDIA devices, the latter demonstrated remarkable code portability to other GPU architectures, such as AMD and Intel. Furthermore, the architectural efficiency rates achieved on these devices were superior in 3 of the 4 cases tested. This brief study highlights the potential of SYCL as a viable solution for…
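For readers unfamiliar with the workload, the following is a minimal CPU-side sketch of Smith-Waterman scoring applied to a small database. It is not the authors' CUDA or SYCL implementation. The function names, the match/mismatch scores, and the linear gap penalty are assumptions chosen for brevity; production protein search tools typically use a substitution matrix such as BLOSUM62 with affine gap penalties.

```python
def smith_waterman_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score (linear gap penalty, assumed scores)."""
    prev = [0] * (len(subject) + 1)  # previous row of the scoring matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            up = prev[j] + gap        # gap in the subject sequence
            left = curr[j - 1] + gap  # gap in the query sequence
            cell = max(0, diag, up, left)  # clamping at 0 makes the alignment local
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def search_database(query, database):
    """Score the query against every sequence and rank hits by score, best first."""
    scores = [(name, smith_waterman_score(query, seq)) for name, seq in database.items()]
    return sorted(scores, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy database; real searches run against far larger sequence sets.
    db = {"seq_a": "HEAGAWGHEE", "seq_b": "WWWW"}
    for name, score in search_database("PAWHEAE", db):
        print(name, score)
```

Each database sequence is scored independently of the others, which is why this kind of search maps naturally onto GPUs and onto multi-GPU configurations.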
Taxonomy
Topics: Genomics and Phylogenetic Studies · Gene expression and cancer classification · Distributed and Parallel Computing Systems
