AVX-512 extension to OpenQCD 1.6

Ed Bennett; Mark Dawson; Michele Mesiti; Jarno Rantaharju

arXiv:1806.06043 · hep-lat · November 22, 2018 · 1 citation

TL;DR

This paper presents an extension of openQCD-1.6 that uses AVX-512 vector instructions, written with Intel intrinsics, on Intel Knights Landing and Xeon Scalable (Skylake) processors. Complete HMC trajectories in lattice QCD simulations run 5% to 10% faster.

Contribution

The paper introduces an AVX-512 extension for openQCD-1.6 that reorganizes data and floating point operations to fit 512-bit wide vector units.

Findings

01. Performance increased by 5% to 10% on clusters using Intel Knights Landing and Xeon Scalable (Skylake) CPUs.

02. Optimal use of AVX-512 requires reorganizing data and floating point operations into the wider vector units.

03. The speedup is measured over complete HMC trajectories with physically relevant parameters.

Abstract

We publish an extension of openQCD-1.6 with AVX-512 vector instructions using Intel intrinsics. Recent Intel processors support extended instruction sets with operations on 512-bit wide vectors, increasing both the capacity for floating point operations and register memory. Optimal use of the new capabilities requires reorganising data and floating point operations into these wider vector units. We report on the implementation and performance of the AVX-512 OpenQCD extension on clusters using Intel Knights Landing and Xeon Scalable (Skylake) CPUs. In complete HMC trajectories with physically relevant parameters we observe a performance increase of 5% to 10%.
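To illustrate what "reorganising data and floating point operations into these wider vector units" can mean in practice, here is a minimal sketch in portable C. It is not taken from the openQCD extension: the type `cvec8` and the functions `pack_cvec8` and `mul_cvec8` are hypothetical names introduced for this example, and the split real/imaginary layout is only one possible choice. The sketch assumes double precision, so one 512-bit register holds 8 values, and it uses plain loops instead of Intel intrinsics so that it compiles on any machine.

```c
#include <stddef.h>

/* 512 bits / 64-bit double = 8 lanes per vector register. */
#define VLEN 8

/* Hypothetical structure-of-arrays block (not an openQCD type):
 * 8 complex numbers with real and imaginary parts stored in separate
 * arrays, so that each array would fill exactly one 512-bit register. */
typedef struct {
    double re[VLEN];
    double im[VLEN];
} cvec8;

/* Repack 8 interleaved (re, im) pairs into the split layout. */
static void pack_cvec8(const double *interleaved, cvec8 *out)
{
    for (size_t i = 0; i < VLEN; i++) {
        out->re[i] = interleaved[2 * i];
        out->im[i] = interleaved[2 * i + 1];
    }
}

/* Lane-wise complex product r = a * b. Every lane performs the same
 * arithmetic with no shuffling between lanes, which is the shape of
 * loop that a compiler or hand-written intrinsics can map onto
 * 512-bit multiply and fused multiply-add instructions. */
static void mul_cvec8(const cvec8 *a, const cvec8 *b, cvec8 *r)
{
    for (size_t i = 0; i < VLEN; i++) {
        r->re[i] = a->re[i] * b->re[i] - a->im[i] * b->im[i];
        r->im[i] = a->re[i] * b->im[i] + a->im[i] * b->re[i];
    }
}
```

With the interleaved layout, a complex product needs lane permutations to bring real and imaginary parts together. The split layout moves that cost into a one-time repacking step, which is the general trade-off behind reorganising data for wide vectors. Whether the published extension makes this particular choice is not stated in the abstract.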

Taxonomy

Topics: Advanced Data Storage Technologies · Algorithms and Data Compression · Parallel Computing and Optimization Techniques