SU(2) Lattice Gauge Theory Simulations on Fermi GPUs

Nuno Cardoso; Pedro Bicudo

arXiv:1010.4834·hep-lat·March 17, 2015

TL;DR

This paper evaluates CUDA GPU performance for quenched SU(2) lattice gauge theory simulations, demonstrating large speedups over a CPU and comparing single and double precision across the G200 and Fermi architectures.

Contribution

It provides optimized GPU codes for SU(2) lattice simulations and compares performance across architectures and precisions, highlighting efficiency gains.

Findings

01. 200x speedup with two Fermi GPUs over one CPU in single precision

02. Double precision computations are less than twice as slow as single precision on Fermi architecture

03. GPU implementations outperform CPU in lattice gauge theory simulations

Abstract

In this work we explore the performance of CUDA in quenched lattice SU(2) simulations. CUDA (Compute Unified Device Architecture) is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi) are also presented. To obtain high performance, the code must be optimized for the GPU architecture, i.e., the implementation must exploit the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU(2) lattice gauge configurations, for the mean plaquette, for the Polyakov loop at finite T and for the Wilson loop. We also present results for the potential using many configurations ($50000$) without smearing and…
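To make one of the observables named in the abstract concrete, here is a minimal CPU reference sketch of the mean plaquette for SU(2) in plain Python. It is not the authors' CUDA code. The quaternion storage of links, the dictionary layout, and all function names (`qmul`, `qdag`, `shift`, `mean_plaquette`, `cold_start`, `pure_gauge`) are assumptions made for illustration only.

```python
import itertools
import math
import random


def qmul(a, b):
    """Multiply two SU(2) matrices stored as unit quaternions (a0, a1, a2, a3),
    where U = a0*I + i*(a1*s1 + a2*s2 + a3*s3) and s_k are the Pauli matrices."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + b0 * a1 - (a2 * b3 - a3 * b2),
        a0 * b2 + b0 * a2 - (a3 * b1 - a1 * b3),
        a0 * b3 + b0 * a3 - (a1 * b2 - a2 * b1),
    )


def qdag(a):
    """Hermitian conjugate: flip the sign of the vector part."""
    return (a[0], -a[1], -a[2], -a[3])


def random_su2(rng):
    """Random SU(2) element: a normalized 4-vector of Gaussian samples."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def shift(site, mu, dims):
    """Neighbor of `site` one step in direction mu, with periodic boundaries."""
    s = list(site)
    s[mu] = (s[mu] + 1) % dims[mu]
    return tuple(s)


def sites(dims):
    return itertools.product(*(range(n) for n in dims))


def mean_plaquette(links, dims):
    """Average of (1/2) Tr[U_mu(x) U_nu(x+mu) U_mu(x+nu)^+ U_nu(x)^+]
    over all sites and all planes mu < nu. For a quaternion, (1/2) Tr U = a0."""
    total, count = 0.0, 0
    for site in sites(dims):
        for mu in range(len(dims)):
            for nu in range(mu + 1, len(dims)):
                p = qmul(links[site, mu], links[shift(site, mu, dims), nu])
                p = qmul(p, qdag(links[shift(site, nu, dims), mu]))
                p = qmul(p, qdag(links[site, nu]))
                total += p[0]
                count += 1
    return total / count


def cold_start(dims):
    """All links set to the identity."""
    return {(s, mu): (1.0, 0.0, 0.0, 0.0) for s in sites(dims) for mu in range(len(dims))}


def pure_gauge(dims, rng):
    """Links U_mu(x) = g(x) g(x+mu)^+, a gauge transform of the cold start."""
    g = {s: random_su2(rng) for s in sites(dims)}
    return {
        (s, mu): qmul(g[s], qdag(g[shift(s, mu, dims)]))
        for s in sites(dims)
        for mu in range(len(dims))
    }
```

Both a cold start and any pure-gauge configuration should give a mean plaquette of 1 up to rounding, which is a convenient sanity check before comparing against a GPU implementation. A GPU version would typically assign one thread per site and reduce the per-site sums, but that mapping is not shown here.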
