Towards Lattice Quantum Chromodynamics on FPGA devices
Grzegorz Korcyl, Piotr Korcyl

TL;DR
This paper presents an FPGA-based, double-precision implementation of the Conjugate Gradient algorithm for Lattice Quantum Chromodynamics, reporting performance comparable to current CPUs and Xeon Phi accelerators, with potential for power-efficient HPC systems.
Contribution
It introduces an FPGA implementation of Dirac operator inversion in Lattice QCD in which the entire multiplication by the Dirac operator runs in hardware, while the rest of the Conjugate Gradient algorithm runs on the host.
Findings
The FPGA implementation achieves performance comparable to current CPUs and Intel's many-core Xeon Phi accelerators.
Hardware acceleration of Dirac operator multiplication enhances computational efficiency.
Power-efficient FPGA-based HPC systems are feasible for Lattice QCD simulations.
Abstract
In this paper we describe a single-node, double-precision Field Programmable Gate Array (FPGA) implementation of the Conjugate Gradient algorithm in the context of Lattice Quantum Chromodynamics. As a benchmark of our proposal we numerically invert the Dirac-Wilson operator on a 4-dimensional grid on three Xilinx hardware solutions: the Zynq Ultrascale+ evaluation board, the Alveo U250 accelerator and the largest device available on the market, the VU13P device. In our implementation we separate software/hardware parts in such a way that the entire multiplication by the Dirac operator is performed in hardware, and the rest of the algorithm runs on the host. We find that the FPGA implementation can offer a performance comparable with that obtained using current CPUs or Intel's many-core Xeon Phi accelerators. A possible multi-node FPGA-based system is discussed and we argue that…
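The hardware/host split described in the abstract can be illustrated with a minimal Conjugate Gradient sketch in which the operator multiplication is the only step delegated to a callback. This is not the authors' code: `apply_operator` is a hypothetical stand-in (a periodic 4D lattice Laplacian plus a mass term, chosen because it is Hermitian positive definite), not the Dirac-Wilson operator, and the names `conjugate_gradient`, `mass`, `tol` and `max_iter` are assumptions made for illustration. In a real setup the callback would hand the vector to the FPGA kernel, and CG would typically be applied to a Hermitian positive-definite combination of the Dirac operator.

```python
import numpy as np


def apply_operator(v, mass=0.5):
    """Hypothetical stand-in for the hardware-offloaded kernel.

    Periodic lattice Laplacian plus a mass term. It is Hermitian positive
    definite for mass > 0, but it is NOT the Dirac-Wilson operator.
    """
    out = (mass + 2 * v.ndim) * v
    for axis in range(v.ndim):
        # Nearest-neighbour coupling with periodic boundary conditions.
        out -= np.roll(v, 1, axis=axis) + np.roll(v, -1, axis=axis)
    return out


def conjugate_gradient(apply_op, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a Hermitian positive-definite A given as a callback.

    Everything except the call to apply_op plays the role of the host side.
    Returns the solution and the number of iterations performed.
    """
    x = np.zeros_like(b)
    r = b.copy()                      # residual b - A x for the start x = 0
    p = r.copy()                      # initial search direction
    rr = np.vdot(r, r).real           # squared residual norm
    b_norm = np.sqrt(np.vdot(b, b).real)
    for iteration in range(1, max_iter + 1):
        ap = apply_op(p)              # the single "hardware" step per iteration
        alpha = rr / np.vdot(p, ap).real
        x += alpha * p
        r -= alpha * ap
        rr_new = np.vdot(r, r).real
        if np.sqrt(rr_new) <= tol * b_norm:
            return x, iteration
        p = r + (rr_new / rr) * p     # new direction, conjugate to the old ones
        rr = rr_new
    return x, max_iter


# Usage: double-precision complex field on a small 4-dimensional periodic grid.
rng = np.random.default_rng(0)
shape = (4, 4, 4, 4)
source = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
solution, iterations = conjugate_gradient(apply_operator, source)
```

Keeping the operator behind a callback means the host-side loop only performs vector updates and dot products, so the same loop could in principle drive a software reference kernel or an accelerator kernel without modification.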
