Grid on QPACE 4

Peter Georg; Nils Meyer; Stefan Solbrig; Tilo Wettig

arXiv:2112.01852·hep-lat·December 6, 2021

Open Access

TL;DR

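This paper describes QPACE 4, a system deployed in 2020 with 64 Fujitsu A64FX processors, and the port of the Grid LQCD framework to the ARM Scalable Vector Extension (SVE). It reports on the status of SVE compilers, the performance of Grid on the machine, and the benefits of an alternative data layout of complex numbers for the Domain Wall operator.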

Contribution

The paper presents the authors' port of the Grid LQCD framework to ARM SVE and analyzes its performance on QPACE 4.

Findings

01 Successful port of Grid to ARM SVE

02 Performance insights of Grid on QPACE 4

03 Advantages of alternative data layout for Domain Wall operator

Abstract

In 2020 we deployed QPACE 4, which features 64 Fujitsu A64FX model FX700 processors interconnected by InfiniBand EDR. QPACE 4 runs an open-source software stack. For Lattice QCD simulations we ported the Grid LQCD framework to support the ARM Scalable Vector Extension (SVE). In this contribution we discuss our SVE port of Grid, the status of SVE compilers and the performance of Grid. We also present the benefits of an alternative data layout of complex numbers for the Domain Wall operator.
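The abstract does not say which alternative data layout of complex numbers is used. As a rough illustration of what a layout change can mean, the sketch below contrasts the common interleaved layout (real and imaginary parts adjacent in memory) with a split layout (all real parts in one array, all imaginary parts in another). This is an assumption made for illustration only: the names `mul_interleaved`, `mul_split`, `SplitComplex`, and `to_split` are hypothetical, the code is plain scalar C++ rather than SVE intrinsics, and none of it is taken from Grid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interleaved layout (assumed for illustration): [re0, im0, re1, im1, ...].
// Each complex product mixes neighbouring elements of the same array.
std::vector<double> mul_interleaved(const std::vector<double>& a,
                                    const std::vector<double>& b) {
    assert(a.size() == b.size() && a.size() % 2 == 0);
    std::vector<double> c(a.size());
    for (std::size_t i = 0; i < a.size(); i += 2) {
        c[i]     = a[i] * b[i]     - a[i + 1] * b[i + 1];  // real part
        c[i + 1] = a[i] * b[i + 1] + a[i + 1] * b[i];      // imaginary part
    }
    return c;
}

// Split layout (hypothetical alternative): real and imaginary parts
// are stored in two separate arrays of equal length.
struct SplitComplex {
    std::vector<double> re;
    std::vector<double> im;
};

// Element-wise complex product in the split layout. Every statement
// acts on whole arrays with the same index, so no shuffling between
// real and imaginary lanes is needed.
SplitComplex mul_split(const SplitComplex& a, const SplitComplex& b) {
    assert(a.re.size() == a.im.size() && b.re.size() == b.im.size());
    assert(a.re.size() == b.re.size());
    const std::size_t n = a.re.size();
    SplitComplex c{std::vector<double>(n), std::vector<double>(n)};
    for (std::size_t i = 0; i < n; ++i) {
        c.re[i] = a.re[i] * b.re[i] - a.im[i] * b.im[i];
        c.im[i] = a.re[i] * b.im[i] + a.im[i] * b.re[i];
    }
    return c;
}

// Convert an interleaved array into the split layout.
SplitComplex to_split(const std::vector<double>& v) {
    assert(v.size() % 2 == 0);
    SplitComplex s;
    for (std::size_t i = 0; i < v.size(); i += 2) {
        s.re.push_back(v[i]);
        s.im.push_back(v[i + 1]);
    }
    return s;
}
```

Both functions compute the same products; only the memory layout differs. Whether one layout is faster depends on the instruction set and compiler, which is the kind of question the paper evaluates for the Domain Wall operator on the A64FX.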

Taxonomy

Topics: Distributed and Parallel Computing Systems · Particle physics theoretical and experimental studies · Computational Physics and Python Applications