SIMD Vectorization for the Lennard-Jones Potential with AVX2 and AVX-512 instructions

Hiroshi Watanabe; Koh M. Nakagawa

arXiv:1806.05713·cs.MS·February 20, 2019

TL;DR

This paper explores SIMD vectorization of Lennard-Jones force calculations using AVX2 and AVX-512, demonstrating significant performance improvements and analyzing data layout and optimization techniques across different Intel architectures.

Contribution

It provides a detailed analysis of data layout choices and optimization strategies for vectorizing Lennard-Jones force calculations with AVX2 and AVX-512.

Findings

01. AoS with padding outperforms SoA when both are vectorized and optimized appropriately; the gap is significant with AVX2 and minor with AVX-512.

02. Vectorization improves performance by up to 42%.

03. Performance varies across architectures and vectorization methods.

Abstract

This work describes the SIMD vectorization of the force calculation of the Lennard-Jones potential with Intel AVX2 and AVX-512 instruction sets. Since the force-calculation kernel of the molecular dynamics method involves indirect access to memory, the data layout is one of the most important factors in vectorization. We find that the Array of Structures (AoS) with padding exhibits better performance than Structure of Arrays (SoA) with appropriate vectorization and optimizations. In particular, AoS with 512-bit width exhibits the best performance among the architectures. While the difference in performance between AoS and SoA is significant for the vectorization with AVX2, that with AVX-512 is minor. The effect of other optimization techniques, such as software pipelining together with vectorization, is also discussed. We present results for benchmarks on three CPU architectures: Intel…
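The two data layouts compared in the abstract can be illustrated with a minimal scalar sketch. It is not taken from the paper's repository: the names `Vec4`, `SoA`, `lj_factor`, `force_aos`, and `force_soa` are hypothetical, reduced units (epsilon = sigma = 1) are assumed, there is no cutoff, and no intrinsics are used. The point is only to show where the indirect access through a pair list occurs and why padding each particle to four doubles makes it load-friendly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AoS with padding (hypothetical type): x, y, z plus one unused double.
// The particle occupies 32 bytes, so a single 256-bit load could fetch all
// of its coordinates at once (aligned loads would also need aligned storage).
struct Vec4 {
  double x, y, z, pad;
};
static_assert(sizeof(Vec4) == 32, "one particle should span 256 bits");

// SoA (hypothetical type): one contiguous array per component. Fetching
// particle j costs three separate indexed reads, which is why vectorized
// SoA kernels tend to rely on gather instructions.
struct SoA {
  std::vector<double> x, y, z;
};

// Assumes V(r) = 4 (r^-12 - r^-6). The force on i from j is then
// (48 r^-14 - 24 r^-8) * (r_i - r_j); this returns the scalar prefactor.
inline double lj_factor(double r2) {
  const double r6 = r2 * r2 * r2;
  return (48.0 - 24.0 * r6) / (r6 * r6 * r2);
}

// Pair-list kernel on the padded AoS layout. The indices i and j come from
// lists, so every coordinate access is indirect.
void force_aos(const std::vector<Vec4>& q, std::vector<Vec4>& f,
               const std::vector<int>& i_list, const std::vector<int>& j_list) {
  assert(i_list.size() == j_list.size());
  for (std::size_t k = 0; k < i_list.size(); ++k) {
    const int i = i_list[k], j = j_list[k];
    const double dx = q[i].x - q[j].x;
    const double dy = q[i].y - q[j].y;
    const double dz = q[i].z - q[j].z;
    const double c = lj_factor(dx * dx + dy * dy + dz * dz);
    f[i].x += c * dx; f[i].y += c * dy; f[i].z += c * dz;
    f[j].x -= c * dx; f[j].y -= c * dy; f[j].z -= c * dz;  // Newton's third law
  }
}

// The same kernel on the SoA layout; only the addressing changes.
void force_soa(const SoA& q, SoA& f,
               const std::vector<int>& i_list, const std::vector<int>& j_list) {
  assert(i_list.size() == j_list.size());
  for (std::size_t k = 0; k < i_list.size(); ++k) {
    const int i = i_list[k], j = j_list[k];
    const double dx = q.x[i] - q.x[j];
    const double dy = q.y[i] - q.y[j];
    const double dz = q.z[i] - q.z[j];
    const double c = lj_factor(dx * dx + dy * dy + dz * dz);
    f.x[i] += c * dx; f.y[i] += c * dy; f.z[i] += c * dz;
    f.x[j] -= c * dx; f.y[j] -= c * dy; f.z[j] -= c * dz;
  }
}
```

Both kernels perform identical arithmetic and differ only in how a particle's coordinates are addressed. In a hand-vectorized version, the padded AoS layout would let each neighbor be fetched with one contiguous load followed by an in-register transpose, whereas the SoA layout would typically need one gather per component. That reading is consistent with the abstract's observation about layout, though the sketch itself makes no performance claim.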

Code & Models

Repositories

kaityo256/lj_simd (Official)