Portable Lattice QCD implementation based on OpenCL

Piyush Kumar; Szabolcs Borsanyi; Jana N. Guenther; Chik Him Wong

arXiv:2502.03249·hep-lat·February 6, 2025

Open Access

TL;DR

This paper presents a portable OpenCL implementation of Lattice QCD, benchmarking its performance across different GPU architectures and comparing it with CUDA-based implementations to evaluate portability and efficiency.

Contribution

The authors developed an OpenCL backend for Lattice QCD simulations, enabling cross-architecture compatibility and providing performance benchmarks against CUDA implementations.

Findings

01. OpenCL implementation performs comparably to CUDA on Nvidia GPUs

02. Significant portability achieved across AMD and Nvidia hardware

03. Benchmark results demonstrate the viability of OpenCL for Lattice QCD

Abstract

The availability of GPUs from different vendors requires Lattice QCD codes to support multiple architectures. To this end, the Open Computing Language (OpenCL) is one of the viable frameworks for writing portable code. It is of interest to find out how an OpenCL implementation performs compared to code based on a dedicated programming interface such as CUDA for Nvidia GPUs. We have developed an OpenCL backend for the existing code of the Wuppertal-Budapest collaboration. In this contribution, we show benchmarks of the most time-consuming part of the numerical simulation, namely the inversion of the Dirac operator. We present the code performance on the JUWELS and LUMI supercomputers, which are based on Nvidia and AMD graphics cards, respectively, and compare it with the CUDA backend implementation.
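To make "inversion of the Dirac operator" concrete: in practice this means solving a large sparse linear system with an iterative method rather than forming an inverse. The sketch below is only an illustration under stated assumptions. It uses conjugate gradient, a common choice for such solves, although the abstract does not name the solver used. The operator is a hypothetical toy (a mass term minus a 1D periodic lattice Laplacian), chosen because it is symmetric positive definite; it is not the operator benchmarked in the paper. The names `apply_operator`, `conjugate_gradient`, and `mass_sq` are invented for this example.

```python
# Toy stand-in for a Dirac-operator inversion: solve A x = b iteratively.
# ASSUMPTION: A = mass_sq - (1D periodic lattice Laplacian). It is symmetric
# positive definite, which conjugate gradient requires, but it is NOT the
# operator used in the paper.

def apply_operator(x, mass_sq=0.5):
    """Apply A to a field x living on a periodic 1D lattice."""
    n = len(x)
    return [(mass_sq + 2.0) * x[i] - x[(i + 1) % n] - x[(i - 1) % n]
            for i in range(n)]

def dot(u, v):
    """Plain inner product of two real lattice fields."""
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b; stop when |r| <= tol * |b|. Returns (x, iterations)."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    bb = dot(b, b)
    for iteration in range(max_iter):
        if rr <= tol * tol * bb:
            return x, iteration
        ap = apply_a(p)      # the only place the operator is applied
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter

source = [0.0] * 16
source[0] = 1.0              # point source on a 16-site lattice
solution, iterations = conjugate_gradient(apply_operator, source)
residual = [bi - ai for bi, ai in zip(source, apply_operator(solution))]
```

Each iteration costs one operator application plus a few vector updates and reductions, which is why kernels of this kind are natural candidates for a GPU backend, whether written in OpenCL or CUDA.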

Taxonomy

Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Scientific Computing and Data Management