A mixed precision semi-Lagrangian algorithm and its performance on accelerators
Lukas Einkemmer

TL;DR
This paper introduces a mixed precision semi-Lagrangian discontinuous Galerkin algorithm that considerably reduces memory usage and substantially improves performance on a dual socket workstation, a Xeon Phi, and an NVIDIA K80.
Contribution
The paper proposes a mixed precision algorithm for the semi-Lagrangian discontinuous Galerkin method and evaluates its performance on a dual socket workstation, a Xeon Phi, and an NVIDIA K80.
Findings
Efficient implementation on all tested architectures (dual socket workstation, Xeon Phi, NVIDIA K80)
Considerable reduction in memory usage
Substantial increase in performance in addition to the memory savings
Abstract
In this paper we propose a mixed precision algorithm in the context of the semi-Lagrangian discontinuous Galerkin method. The performance of this approach is evaluated on a traditional dual socket workstation as well as on a Xeon Phi and an NVIDIA K80. We find that the mixed precision algorithm can be implemented efficiently on these architectures. This implies that, in addition to the considerable reduction in memory, a substantial increase in performance can be observed as well. Moreover, we discuss the relative performance of our implementations.
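The abstract does not spell out the algorithm, so the following is only a generic illustration of the mixed precision idea, not the paper's method. It assumes a periodic 1D grid, constant-speed advection, and linear interpolation in place of the discontinuous Galerkin projection. The function name `sl_step` and the parameter `shift` are hypothetical. The state is stored in single precision while the arithmetic runs in double precision, which halves the memory needed for the state.

```python
from array import array
import math

def sl_step(u, shift):
    """One semi-Lagrangian step for u_t + a*u_x = 0 on a periodic grid.

    Assumption: `shift` = a*dt/dx, the displacement measured in cells.
    Storage is single precision (typecode 'f'); the interpolation uses
    Python floats, which are double precision.
    """
    n = len(u)
    out = array('f', [0.0]) * n
    for i in range(n):
        x = i - shift              # foot of the characteristic, in cell units
        j = math.floor(x)          # index of the grid point to the left
        w = x - j                  # linear interpolation weight in [0, 1)
        # Double precision arithmetic, rounded to single on assignment.
        out[i] = (1.0 - w) * u[j % n] + w * u[(j + 1) % n]
    return out

n = 64
u = array('f', [math.sin(2.0 * math.pi * i / n) for i in range(n)])
u_next = sl_step(u, 0.5)

# Bytes needed to store the state in single versus double precision.
bytes_single = array('f').itemsize * n
bytes_double = array('d').itemsize * n
print(bytes_single, bytes_double)
```

Because semi-Lagrangian schemes are typically limited by memory bandwidth rather than arithmetic, storing less data per degree of freedom is one plausible route to the performance gain the abstract reports; the sketch above only demonstrates the storage side.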
