Reproducibility, accuracy and performance of the Feltor code and library on parallel computer architectures

Matthias Wiesenberger; Lukas Einkemmer; Markus Held; Albert Gutierrez-Milla; Xavier Saez; Roman Iakymchuk

arXiv:1807.01971·physics.comp-ph·March 27, 2019·Comput. Phys. Commun.

TL;DR

This paper evaluates the Feltor scientific software library's reproducibility, accuracy, and performance on various parallel architectures, addressing non-determinism, optimizing performance, and modeling execution time.

Contribution

It introduces methods to achieve bitwise reproducibility in parallel simulations and develops a performance model predicting execution times across hardware.

Findings

01. Reproducibility achieved using long accumulator dot products.

02. Performance model predicts execution time with <25% error.

03. Minimum array size for efficient scaling identified.

Abstract

Feltor is a modular and free scientific software package. It allows developing platform independent code that runs on a variety of parallel computer architectures ranging from laptop CPUs to multi-GPU distributed memory systems. Feltor consists of both a numerical library and a collection of application codes built on top of the library. Its main targets are two- and three-dimensional drift- and gyro-fluid simulations with discontinuous Galerkin methods as the main numerical discretization technique. We observe that numerical simulations of a recently developed gyro-fluid model produce non-deterministic results in parallel computations. First, we show how we restore accuracy and bitwise reproducibility algorithmically and programmatically. In particular, we adopt an implementation of the exactly rounded dot product based on long accumulators, which avoids accuracy losses especially in…
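To illustrate why a dot product can be non-deterministic in parallel, and why an exactly rounded one is not, here is a minimal Python sketch. It is not Feltor's implementation: the function names `naive_dot` and `exact_dot` are hypothetical, and exact rational arithmetic from the standard `fractions` module stands in for the long accumulator. The idea it shares with the abstract is only that the sum is accumulated without rounding and rounded once at the end, so the result no longer depends on summation order.

```python
from fractions import Fraction


def naive_dot(x, y):
    """Left-to-right floating-point dot product; rounds after every addition."""
    acc = 0.0
    for a, b in zip(x, y):
        acc += a * b
    return acc


def exact_dot(x, y):
    """Accumulate the products exactly, then round once to the nearest double.

    Fraction is an illustrative stand-in for a long accumulator (assumption).
    """
    acc = Fraction(0)
    for a, b in zip(x, y):
        acc += Fraction(a) * Fraction(b)
    return float(acc)


# Same data in two different orders, as a parallel reduction might produce.
x1 = [1e16, 1.0, -1e16, 1.0]
x2 = [1e16, -1e16, 1.0, 1.0]
y = [1.0, 1.0, 1.0, 1.0]

print(naive_dot(x1, y), naive_dot(x2, y))  # results differ with the order
print(exact_dot(x1, y), exact_dot(x2, y))  # results agree
```

In the first ordering the naive sum loses a small term when it is added to the large one, so the two orderings disagree. The exact version returns the same value for both because no intermediate rounding takes place.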
