# Gyrokinetic Simulations on Many- and Multi-core Architectures with the Global Electromagnetic Particle-In-Cell Code ORB5

**Authors:** Noé Ohana, Claudio Gheller, Emmanuel Lanti, Andreas Jocksch, Stephan Brunner, Laurent Villard

arXiv: 1908.02219 · 2020-02-17

## TL;DR

This paper demonstrates how the gyrokinetic code ORB5 was refactored and optimized for modern multi-core and multi-GPU HPC architectures, achieving high performance and portability across different supercomputers.

## Contribution

The paper presents a complete refactoring of ORB5 to support multi/many-core architectures with hybrid MPI, OpenMP, and OpenACC, enabling efficient plasma simulations on modern supercomputers.

## Key findings

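- Reported performance results for the full production ORB5 code on Summit (ORNL), Piz Daint (CSCS), and Marconi (CINECA).
- Demonstrated performance portability: the same source code version produced all results on all architectures.
- Redesigned the data structures to improve data locality and memory access efficiency.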
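
The data-locality point above can be illustrated with a structure-of-arrays (SoA) layout, in which each particle attribute lives in its own contiguous array. The following is a minimal sketch and not ORB5 code: the summary does not say which layout ORB5 uses, and the class `MarkersSoA`, its fields `x`, `v`, `w`, and the helper `aos_to_soa` are hypothetical names chosen for illustration.

```python
from array import array


class MarkersSoA:
    """Hypothetical structure-of-arrays container: one contiguous array per attribute."""

    def __init__(self, n):
        self.x = array("d", [0.0] * n)  # positions
        self.v = array("d", [0.0] * n)  # velocities
        self.w = array("d", [0.0] * n)  # weights

    def push(self, dt):
        # The loop touches only x and v, so it streams through two
        # contiguous arrays and never loads the unused weights.
        x, v = self.x, self.v
        for i in range(len(x)):
            x[i] += v[i] * dt


def aos_to_soa(records):
    """Convert a list of (x, v, w) tuples (array-of-structures) into SoA form."""
    soa = MarkersSoA(len(records))
    for i, (x, v, w) in enumerate(records):
        soa.x[i] = x
        soa.v[i] = v
        soa.w[i] = w
    return soa


if __name__ == "__main__":
    markers = aos_to_soa([(0.0, 1.0, 0.5), (1.0, -2.0, 0.25)])
    markers.push(0.5)
    print(list(markers.x))
```

In a compiled code, a loop of this shape is also the kind that a threading or accelerator directive can be applied to, since every iteration is independent.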

## Abstract

Gyrokinetic codes in plasma physics need outstanding computational resources to solve increasingly complex problems, requiring the effective exploitation of cutting-edge HPC architectures. This paper focuses on the enabling of ORB5, a state-of-the-art, first-principles-based gyrokinetic code, on modern parallel hybrid multi-core, multi-GPU systems. ORB5 is a Lagrangian, Particle-In-Cell (PIC), finite element, global, electromagnetic code, originally implementing distributed parallelism through MPI, based on domain decomposition and domain cloning. In order to support multi/many-core devices, the code has been completely refactored. Data structures have been re-designed to ensure efficient memory access, enhancing data locality. Multi-threading has been introduced through OpenMP on the CPU, and OpenACC has been adopted to support GPU acceleration. MPI can still be used in combination with the two approaches. The performance results obtained using the full production ORB5 code on the Summit system at ORNL, on Piz Daint at CSCS and on the Marconi system at CINECA are presented, showing the effectiveness and performance portability of the adopted solutions: the same source code version was used to produce all results on all architectures.
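
The combination of domain decomposition and domain cloning mentioned in the abstract can be sketched as a two-level rank layout: the grid is split into subdomains, and each subdomain is replicated across several clones that share the particle load and sum their grid contributions. The code below is an illustrative sketch under stated assumptions, not the ORB5 implementation. The row-major rank ordering, the one-dimensional slicing of a periodic coordinate, and the names `rank_layout`, `owner_subdomain`, and `reduce_over_clones` are all hypothetical, and a plain Python sum stands in for the MPI reduction.

```python
import math


def rank_layout(rank, n_subdomains, n_clones):
    """Map a flat rank to (clone, subdomain). Row-major ordering is an assumption."""
    if not 0 <= rank < n_subdomains * n_clones:
        raise ValueError("rank out of range")
    return divmod(rank, n_subdomains)  # (rank // n_subdomains, rank % n_subdomains)


def owner_subdomain(phi, n_subdomains, period=2.0 * math.pi):
    """Subdomain owning coordinate phi, assuming equal-width slices of a periodic axis."""
    index = int((phi % period) / period * n_subdomains)
    return min(index, n_subdomains - 1)  # guard against rounding at the upper edge


def reduce_over_clones(local_grids):
    """Element-wise sum of per-clone grids for one subdomain.

    Stand-in for the reduction an MPI code would perform across clones.
    """
    return [sum(values) for values in zip(*local_grids)]


if __name__ == "__main__":
    # 4 subdomains x 2 clones = 8 ranks in this hypothetical layout.
    for rank in range(8):
        print(rank, rank_layout(rank, n_subdomains=4, n_clones=2))
    print(reduce_over_clones([[1.0, 0.0], [0.5, 2.0]]))
```

Under this layout, adding clones increases the number of ranks without making the subdomains smaller, at the cost of a reduction over the clones of each subdomain.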

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.02219/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1908.02219/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/1908.02219/full.md

---
Source: https://tomesphere.com/paper/1908.02219