Performance Optimisation of Smoothed Particle Hydrodynamics Algorithms for Multi/Many-Core Architectures
Fabio Baruffa, Luigi Iapichino, Nicolay J. Hammer, Vasileios Karakasis

TL;DR
This paper presents a set of code optimisation strategies for Smoothed Particle Hydrodynamics algorithms on multi/many-core architectures, reporting speedups between 2.6x and 19.1x across Intel Xeon and Xeon Phi hardware.
Contribution
The work introduces a comprehensive code modernisation approach for SPH algorithms, including threading, data layout, auto-vectorisation, and algorithmic improvements, demonstrating substantial performance gains.
Findings
2.6x speedup on Ivy Bridge
13.7x speedup on Knights Corner
19.1x speedup on Knights Landing
Abstract
We describe a strategy for code modernisation of Gadget, a widely used community code for computational astrophysics. The focus of this work is on node-level performance optimisation, targeting current multi/many-core Intel® architectures. We identify and isolate a sample code kernel, which is representative of a typical Smoothed Particle Hydrodynamics (SPH) algorithm. The code modifications include threading parallelism optimisation, change of the data layout into Structure of Arrays (SoA), auto-vectorisation and algorithmic improvements in the particle sorting. We obtain shorter execution time and improved threading scalability both on Intel Xeon® (2.6x on Ivy Bridge) and Xeon Phi™ (13.7x on Knights Corner) systems. First few tests of the optimised code result in 19.1x faster execution on second generation Xeon Phi (Knights Landing), thus demonstrating the…
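
The Structure of Arrays (SoA) layout change mentioned in the abstract can be illustrated with a minimal C sketch. This is not Gadget's actual code: the type names (ParticleAoS, ParticlesSoA), the helper functions (aos_to_soa, squared_distances_soa) and the choice of fields are hypothetical and chosen only for illustration. The sketch assumes single-precision positions and masses. It shows why SoA tends to help auto-vectorisation: each field is read with unit stride, so a compiler can load consecutive values into one vector register.

```c
#include <assert.h>
#include <stddef.h>

/* Array of Structures (AoS): the fields of one particle are adjacent in
 * memory, so reading only x for many particles is a strided access.
 * Hypothetical layout, not Gadget's real particle struct. */
typedef struct {
    float x, y, z;
    float mass;
} ParticleAoS;

/* Structure of Arrays (SoA): each field is its own contiguous array,
 * so a loop over particles reads every field with unit stride. */
typedef struct {
    float *x, *y, *z;
    float *mass;
    size_t n; /* number of valid particles */
} ParticlesSoA;

/* Copy n particles from AoS into SoA storage.
 * Assumption: the caller has already allocated at least n elements
 * for each of the arrays in *out. */
void aos_to_soa(const ParticleAoS *in, size_t n, ParticlesSoA *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
        out->mass[i] = in[i].mass;
    }
    out->n = n;
}

/* Squared distance of every particle from the point (cx, cy, cz).
 * The loop body has no cross-iteration dependency and all accesses are
 * unit-stride; restrict tells the compiler the arrays do not overlap,
 * which is what an auto-vectoriser typically needs to know. */
void squared_distances_soa(const ParticlesSoA *p,
                           float cx, float cy, float cz,
                           float *restrict r2)
{
    const float *restrict x = p->x;
    const float *restrict y = p->y;
    const float *restrict z = p->z;

    for (size_t i = 0; i < p->n; ++i) {
        const float dx = x[i] - cx;
        const float dy = y[i] - cy;
        const float dz = z[i] - cz;
        r2[i] = dx * dx + dy * dy + dz * dz;
    }
}
```

A caller would fill a ParticlesSoA once (for example via aos_to_soa) and then run kernels such as squared_distances_soa over it. Whether the loop is actually vectorised depends on the compiler and its flags, so checking the compiler's vectorisation report is advisable.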
