Implementation of the twisted mass fermion operator in the QUDA library
Alexei Strelchenko, Constantia Alexandrou, Giannis Koutsou, and Alejandro Vaquero Aviles-Casco

TL;DR
This paper extends the QUDA library to efficiently implement the Wilson twisted mass fermion operator, demonstrating high-performance results on NVIDIA GPUs for both degenerate and non-degenerate flavor doublets, supporting hadron structure research.
Contribution
The paper introduces a new implementation of the twisted mass fermion operator in QUDA, optimizing performance for GPU computing in lattice QCD simulations.
Findings
Achieves up to 856 Gflops for degenerate doublets at half precision.
Reaches 879 Gflops for non-degenerate doublets at half precision.
Supports production-level calculations for hadron structure studies.
Abstract
We discuss an extension of the QUDA library for the Wilson twisted mass operator. A performance analysis is presented for both degenerate and non-degenerate flavor doublets. The degenerate twisted mass fermion operator runs at up to 190, 487 and 856 Gflops in double, single and half precision, respectively, on recent NVIDIA Kepler GPUs, while our implementation for the non-degenerate flavor doublet reaches 163, 516 and 879 Gflops, respectively. The code is currently in production for hadron structure studies.
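For readers unfamiliar with the two cases named in the abstract, the operators can be sketched as follows. This is the commonly used textbook form, not taken from the paper itself: the symbols $D_W$ (massless Wilson Dirac operator), $m_0$ (bare mass), $\mu$, $\mu_\sigma$, $\mu_\delta$ (twisted mass and mass-splitting parameters) and the assignment of the Pauli matrices $\tau^1$, $\tau^3$ in flavor space are assumptions, and sign and basis conventions differ between codes.

```latex
% Degenerate doublet: acts on a two-flavor field \chi = (\chi_u, \chi_d)^T
D_{\mathrm{tm}} = D_W + m_0 + i\,\mu\,\gamma_5\,\tau^3

% Because \tau^3 is diagonal, the two flavors decouple:
D_{\mathrm{tm}}^{\pm} = D_W + m_0 \pm i\,\mu\,\gamma_5

% Non-degenerate doublet (one common convention): the \tau^1 term
% mixes the two flavors, so the operator no longer splits into
% independent single-flavor pieces
D_{\mathrm{nd}} = D_W + m_0 + i\,\mu_\sigma\,\gamma_5\,\tau^3 - \mu_\delta\,\tau^1
```

In this form the degenerate operator can be applied one flavor at a time, while the non-degenerate operator has to act on both flavor components together.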
Taxonomy
Topics: Particle physics theoretical and experimental studies · Quantum Chromodynamics and Particle Interactions · High-Energy Particle Collisions Research
