Portable multi-node LQCD Monte Carlo simulations using OpenACC
Claudio Bonati, Enrico Calore, Massimo D'Elia, Michele Mesiti, Francesco Negro, Francesco Sanfilippo, Sebastiano Fabio Schifano, Giorgio Silvi, and Raffaele Tripiccione

TL;DR
This paper presents a portable, high-performance multi-node Lattice QCD Monte Carlo simulation code using OpenACC and OpenMPI, optimized for GPUs and CPUs, demonstrating good scalability and performance across architectures.
Contribution
The work introduces a portable multi-node LQCD Monte Carlo code leveraging OpenACC for architecture independence and details performance optimization and scaling on various processors.
Findings
High GPU performance achieved for LQCD simulations
Effective multi-node parallelization with OpenMPI
Good scalability demonstrated across architectures
Abstract
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, which allows the same code to be compiled for several processor architectures. The paper focuses on parallelization across multiple computing nodes, using OpenACC to manage parallelism within a node and OpenMPI to manage parallelism among nodes. We first discuss the strategies available to maximize performance, then describe selected relevant details of the code, and finally measure the performance and scaling that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly high level of performance for this application, but also compares…
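The two-level split described in the abstract (parallelism within a node, communication among nodes) can be illustrated with a toy sketch. This is not the authors' code: it uses plain Python on a 1D periodic lattice, the function names (`global_hop`, `split`, `exchange_halos`, `local_hop`) are hypothetical, and ordinary list operations stand in for both the OpenACC loops and the MPI halo exchange. It only shows that a nearest-neighbour stencil computed on halo-padded sub-lattices reproduces the single-lattice result.

```python
# Toy illustration only: a 1D periodic lattice split across hypothetical "ranks".
# Plain Python stands in for OpenACC (in-node loops) and MPI (halo exchange).

def global_hop(field):
    """Nearest-neighbour sum on the full periodic lattice (reference result)."""
    n = len(field)
    return [field[(i - 1) % n] + field[(i + 1) % n] for i in range(n)]

def split(field, n_ranks):
    """Cut the lattice into equal contiguous sub-lattices, one per rank."""
    if len(field) % n_ranks != 0:
        raise ValueError("lattice size must be divisible by the number of ranks")
    size = len(field) // n_ranks
    return [field[r * size:(r + 1) * size] for r in range(n_ranks)]

def exchange_halos(sub_lattices):
    """Pad each sub-lattice with one boundary site from each periodic neighbour.
    In a multi-node code this step would be inter-node communication."""
    n = len(sub_lattices)
    return [
        [sub_lattices[(r - 1) % n][-1]] + sub_lattices[r] + [sub_lattices[(r + 1) % n][0]]
        for r in range(n)
    ]

def local_hop(padded):
    """Apply the same stencil to the interior sites of a halo-padded sub-lattice.
    In a multi-node code this loop would be the in-node parallel part."""
    return [padded[i - 1] + padded[i + 1] for i in range(1, len(padded) - 1)]

field = [float(i * i % 7) for i in range(16)]
n_ranks = 4
padded = exchange_halos(split(field, n_ranks))
decomposed = [x for p in padded for x in local_hop(p)]

# The decomposed computation should agree with the single-lattice one.
print(decomposed == global_hop(field))
```

Each sub-lattice only needs one extra site per boundary, so the data exchanged scales with the surface of the local lattice while the work scales with its volume; this is the usual reason such decompositions are expected to scale.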
