Nodal Discontinuous Galerkin Methods on Graphics Processors

Andreas Klöckner; Tim Warburton; Jeffrey Bridge; Jan S. Hesthaven

arXiv:0901.1024·math.NA·November 18, 2009·J. Comput. Phys.

TL;DR

This paper demonstrates how Discontinuous Galerkin methods can be efficiently implemented on GPUs, achieving significant speedups and high computational throughput for solving PDEs like Maxwell's equations.

Contribution

The paper introduces GPU-optimized algorithms for DG methods. They exploit the element-local structure of the DG operator and its high arithmetic intensity to make effective use of the GPU's memory bandwidth.

Findings

01 Achieved a 40-60x speedup over a serial CPU implementation.

02 Exceeded 200 GFLOP/s of application-level floating-point work in example computations.

03 Demonstrated effective GPU utilization for solving PDEs on unstructured grids.

Abstract

Discontinuous Galerkin (DG) methods for the numerical solution of partial differential equations have enjoyed considerable success because they are both flexible and robust: They allow arbitrary unstructured geometries and easy control of accuracy without compromising simulation stability. Lately, another property of DG has been growing in importance: The majority of a DG operator is applied in an element-local way, with weak penalty-based element-to-element coupling. The resulting locality in memory access is one of the factors that enables DG to run on off-the-shelf, massively parallel graphics processors (GPUs). In addition, DG's high-order nature lets it require fewer data points per represented wavelength and hence fewer memory accesses, in exchange for higher arithmetic intensity. Both of these factors work significantly in favor of a GPU implementation of DG. Using a single…
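
The two properties named in the abstract, element-local operator application and the trade of memory accesses for arithmetic at high order, can be illustrated with a minimal sketch. This is plain CPU-side Python, not the paper's GPU kernels. The function names are hypothetical, and the intensity estimate assumes single-precision values and an operator matrix that stays in fast on-chip memory, so only the field data is counted as memory traffic.

```python
def nodes_per_tet(order):
    """Number of nodes of a degree-`order` nodal basis on a tetrahedron."""
    return (order + 1) * (order + 2) * (order + 3) // 6


def apply_local_operator(matrix, element_dofs):
    """Apply one dense matrix to each element's nodal values independently.

    No element reads another element's data, which is the locality
    the abstract refers to. (Hypothetical helper, for illustration only.)
    """
    return [
        [sum(a * x for a, x in zip(row, dofs)) for row in matrix]
        for dofs in element_dofs
    ]


def arithmetic_intensity(order, bytes_per_value=4):
    """Estimated flops per byte of field data for one dense local operator.

    Assumption: the matrix itself is not re-read from main memory.
    """
    n = nodes_per_tet(order)
    flops = 2 * n * n                      # one multiply and one add per matrix entry
    bytes_moved = 2 * n * bytes_per_value  # read the input values, write the results
    return flops / bytes_moved


if __name__ == "__main__":
    swap = [[0, 1], [1, 0]]  # toy 2x2 "operator"
    print(apply_local_operator(swap, [[1, 2], [3, 4]]))
    for order in (1, 2, 4, 8):
        print(order, nodes_per_tet(order), arithmetic_intensity(order))
```

Under these assumptions the estimate grows linearly with the number of nodes per element, which is one way to see why higher polynomial orders shift the workload from memory traffic toward arithmetic.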
