Practical Implementation of Lattice QCD Simulation on Intel Xeon Phi Knights Landing

Issaku Kanamori; Hideo Matsufuru

arXiv:1712.01505·hep-lat·December 6, 2017·CANDAR·1 cites

TL;DR

This paper examines how to implement lattice QCD simulations on the Intel Xeon Phi Knights Landing (KNL), focusing on tuning the matrix multiplication and iterative solver for its SIMD architecture, with performance measured on the Oakforest-PACS system.

Contribution

It presents practical methods for optimizing lattice QCD code on KNL, including SIMD intrinsics and prefetching techniques, with performance tuning insights.

Findings

1. Optimized solver performance on KNL using SIMD intrinsics
2. Effective prefetching strategies for large sparse matrix operations
3. Performance tuning guidelines for SIMD and parallel architectures

Abstract

We investigate the implementation of lattice Quantum Chromodynamics (QCD) code on the Intel Xeon Phi Knights Landing (KNL). The most time-consuming part of numerical lattice QCD simulations is solving a linear equation for a large sparse matrix that represents the strong interaction among quarks. To establish widely applicable prescriptions, we apply rather general methods for the SIMD architecture of KNL, such as using intrinsics and manual prefetching, to the matrix multiplication and iterative solver algorithms. Based on the performance measured on the Oakforest-PACS system, we discuss performance tuning on KNL as well as code design that facilitates such tuning on SIMD architectures and massively parallel machines.
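
Illustrative sketch

The abstract names two ingredients: an iterative solver for a large sparse matrix, and manual prefetching inside the matrix multiplication. The C sketch below is not the authors' code; it is a minimal illustration under stated assumptions. A one-dimensional nearest-neighbor stencil (hypothetical `apply_stencil`, with a hypothetical mass-like shift `m`) stands in for the lattice operator, a plain conjugate gradient loop (hypothetical `cg_solve`) stands in for the solver, and the prefetch distance of 64 elements is an arbitrary guess rather than a tuned value. AVX-512 intrinsics are left out so that the sketch compiles on any machine; the prefetch hint uses the GCC/Clang builtin `__builtin_prefetch`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the sparse operator: y = A x with
   A = tridiag(-1, 2 + m, -1), symmetric positive definite for m > 0.
   The matrix is never stored; it is applied "matrix-free". */
void apply_stencil(const double *x, double *y, size_t n, double m)
{
    assert(n >= 2);
    const size_t dist = 64; /* prefetch distance in elements (assumed, untuned) */
    for (size_t i = 0; i < n; ++i) {
#if defined(__GNUC__) || defined(__clang__)
        /* Hint: x[i + dist] will be read soon (rw = 0, high locality = 3). */
        if (i + dist < n)
            __builtin_prefetch(&x[i + dist], 0, 3);
#endif
        double left  = (i > 0)     ? x[i - 1] : 0.0;
        double right = (i + 1 < n) ? x[i + 1] : 0.0;
        y[i] = (2.0 + m) * x[i] - left - right;
    }
}

double dot(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += a[i] * b[i];
    return s;
}

/* Conjugate gradient for A x = b, starting from x = 0.
   r, p, ap are caller-provided work arrays of length n.
   Stops when |r|^2 < tol^2 |b|^2; returns the iteration count,
   0 if b is zero, or -1 if max_iter is reached without converging. */
int cg_solve(const double *b, double *x, double *r, double *p, double *ap,
             size_t n, double m, double tol, int max_iter)
{
    for (size_t i = 0; i < n; ++i) {
        x[i] = 0.0;
        r[i] = b[i];
        p[i] = b[i];
    }
    double rr = dot(r, r, n);
    const double bb = rr;
    if (bb == 0.0)
        return 0;

    for (int k = 1; k <= max_iter; ++k) {
        apply_stencil(p, ap, n, m);
        double alpha = rr / dot(p, ap, n);
        for (size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
        }
        double rr_new = dot(r, r, n);
        if (rr_new < tol * tol * bb)
            return k;
        double beta = rr_new / rr;
        for (size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return -1;
}
```

To try it, a caller would allocate `b`, `x`, and the three work arrays with the same length, fill `b`, and call `cg_solve(b, x, r, p, ap, n, 0.5, 1e-10, 200)`. Whether the prefetch hint helps at all depends on the hardware prefetcher and the access pattern, which is the kind of question the paper settles by measurement.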

Taxonomy

Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Algorithms and Data Compression