Code Optimization on Kepler GPUs and Xeon Phi
Yong-Chull Jang, Hwancheol Jeong, Jangho Kim, Weonjong Lee, Jeonghwan Pak, Yuree Chung (SWME Collaboration)

TL;DR
This paper measures the performance of lattice QCD code on Kepler GPUs and on the Xeon Phi 7120P coprocessor. Upgrading to the latest CPS and QUDA libraries makes the conjugate gradient (CG) inverter run twice as fast on GPUs, while the Xeon Phi is significantly slower than the GTX Titan Black for the same CG code.
Contribution
The study updates lattice QCD code with recent libraries and provides a performance comparison between Kepler GPUs and Xeon Phi coprocessors.
Findings
Upgrading to the latest CPS and QUDA libraries made the CG inverter run twice as fast as before on GPUs.
For the CG code, the Xeon Phi 7120P is currently significantly slower than the GTX Titan Black, although the two have similar computing power in principle.
Abstract
Kepler GTX Titan Black and Kepler Tesla K40 are still the best GPUs for high-performance computing, although Maxwell GPUs such as the GTX 980 are available on the market. Hence, we measure the performance of our lattice QCD codes using the Kepler GPUs. We also upgrade our code to use the latest CPS (Columbia Physics System) library along with the most recent QUDA (QCD CUDA) library for lattice QCD. These new libraries improve the performance of our conjugate gradient (CG) inverter so that it runs twice as fast as before. We also investigate the performance of the Xeon Phi 7120P coprocessor. In principle, its computing power is similar to that of the Kepler GPUs. However, its performance for our CG code is at present significantly inferior to that of the GTX Titan Black GPUs.
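For readers unfamiliar with the benchmark, the following is a minimal sketch of the conjugate gradient iteration that a CG inverter is built on. It is not the authors' code and does not use CPS or QUDA: the dense 2x2 matrix `A` is a hypothetical stand-in for the lattice operator, and the function names, tolerance, and iteration cap are assumptions chosen for illustration.

```python
# Minimal conjugate gradient sketch for a symmetric positive-definite system A x = b.
# Assumption: a tiny dense matrix stands in for the (much larger, sparse) lattice
# operator; production inverters run the same iteration on GPU kernels.

def matvec(a, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def dot(u, v):
    """Inner product of two real vectors."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def conjugate_gradient(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b iteratively, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - a x, which equals b when x = 0
    p = list(r)            # initial search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:          # stop once the residual norm is small
            break
        ap = matvec(a, p)
        alpha = rr / dot(p, ap)      # step length along the search direction
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr           # keeps successive directions conjugate
        p = [r_i + beta * p_i for r_i, p_i in zip(r, p)]
        rr = rr_new
    return x


# Hypothetical test system; its exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_gradient(A, b)
```

Each iteration costs one matrix-vector product and a few vector updates, so the inverter's speed is largely set by how fast the hardware and libraries can apply the operator.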
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Physics of Superconductivity and Magnetism
