A mixed precision semi-Lagrangian algorithm and its performance on accelerators

Lukas Einkemmer

arXiv:1603.07008 · cs.MS · August 14, 2018

TL;DR

This paper introduces a mixed precision semi-Lagrangian discontinuous Galerkin algorithm that considerably reduces memory usage and substantially improves performance on a dual socket workstation, a Xeon Phi, and an NVIDIA K80.

Contribution

It proposes a mixed precision algorithm for the semi-Lagrangian discontinuous Galerkin method and evaluates its performance on a dual socket workstation, a Xeon Phi, and an NVIDIA K80.

Findings

01 Efficient implementation on diverse architectures

02 Considerable memory reduction achieved

03 Substantial performance increase observed

Abstract

In this paper we propose a mixed precision algorithm in the context of the semi-Lagrangian discontinuous Galerkin method. The performance of this approach is evaluated on a traditional dual socket workstation as well as on a Xeon Phi and an NVIDIA K80. We find that the mixed precision algorithm can be implemented efficiently on these architectures. This implies that, in addition to the considerable reduction in memory, a substantial increase in performance can be observed as well. Moreover, we discuss the relative performance of our implementations.
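
To make the memory claim concrete, here is a minimal sketch of one common mixed precision pattern: the data is stored in single precision while the arithmetic is carried out in double precision. This is an assumption for illustration only; the abstract does not say how the paper divides the work between precisions. The sketch also swaps in plain linear interpolation on a periodic 1D grid in place of the discontinuous Galerkin projection, and the function name `advect` and all parameters are hypothetical.

```python
from array import array
import math

def advect(u, shift):
    """One semi-Lagrangian step on a periodic grid: u_new[i] = u(x_i - shift).

    `shift` is measured in grid spacings. Linear interpolation is used here
    as a stand-in for the discontinuous Galerkin projection (an assumption).
    """
    n = len(u)
    cells = math.floor(shift)      # whole cells travelled
    frac = shift - cells           # remaining fraction of a cell
    out = array(u.typecode, [0.0]) * n   # same storage precision as the input
    for i in range(n):
        left = u[(i - cells - 1) % n]
        right = u[(i - cells) % n]
        # Python floats are doubles, so this arithmetic runs in double
        # precision; the result is rounded only when stored into `out`.
        out[i] = (1.0 - frac) * right + frac * left
    return out

n = 64
init = [math.sin(2 * math.pi * i / n) for i in range(n)]
u_double = array("d", init)   # 8 bytes per value
u_mixed = array("f", init)    # 4 bytes per value

for _ in range(100):
    u_double = advect(u_double, 0.3)
    u_mixed = advect(u_mixed, 0.3)

mem_ratio = (u_mixed.itemsize * n) / (u_double.itemsize * n)
max_diff = max(abs(a - b) for a, b in zip(u_double, u_mixed))
print(f"memory ratio: {mem_ratio}, max difference: {max_diff:.2e}")
```

In this toy setting the single precision storage halves the memory footprint, and the two results differ only by accumulated rounding at the level of single precision. Whether the same holds for the paper's scheme depends on details not given in the abstract.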
