# Manycore parallel computing for a hybridizable discontinuous Galerkin nested multigrid method

**Authors:** M. S. Fabien, M. G. Knepley, R. T. Mills, and B. M. Riviere

arXiv: 1705.09907 · 2019-07-18

## TL;DR

This paper develops a parallel hybridizable discontinuous Galerkin (HDG) nested geometric multigrid solver tuned for many-core processors. It reports convergence rates of 0.2 or less and about 80% of peak bandwidth at high polynomial orders.

## Contribution

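The paper introduces a fine-grain parallelization strategy for an HDG nested geometric multigrid solver on many-core architectures. It relies on a matrix-free (assembly-free) matrix-vector multiply that needs no synchronizations or barriers and exploits the data locality of the HDG discretization.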
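
To make the matrix-free (assembly-free) idea concrete, here is a minimal NumPy sketch of a matrix-vector product computed from element-local blocks, without ever forming the global matrix. Everything in it is an illustrative assumption rather than the authors' routine: the 1D chain connectivity `conn`, the random 2x2 blocks in `local_mats`, and the helper names are hypothetical, and the scatter step uses `np.add.at`, not the paper's synchronization-free scheme.

```python
import numpy as np


def assembly_free_matvec(local_mats, conn, x):
    """Compute y = A @ x for A = sum_e P_e^T K_e P_e without forming A."""
    x_local = x[conn]  # gather: one row of local values per element
    # Independent small dense products, one per element.
    y_local = np.einsum("eij,ej->ei", local_mats, x_local)
    y = np.zeros_like(x)
    np.add.at(y, conn, y_local)  # scatter-add the local results
    return y


def assemble_dense(local_mats, conn, n_dofs):
    """Reference only: build the global dense matrix explicitly."""
    A = np.zeros((n_dofs, n_dofs))
    for K, dofs in zip(local_mats, conn):
        A[np.ix_(dofs, dofs)] += K
    return A


# Hypothetical toy problem: 6 elements in a chain, 2 unknowns per element.
rng = np.random.default_rng(0)
n_elem, n_loc = 6, 2
conn = np.stack([np.arange(n_elem), np.arange(1, n_elem + 1)], axis=1)
local_mats = rng.standard_normal((n_elem, n_loc, n_loc))
x = rng.standard_normal(n_elem + 1)

y_free = assembly_free_matvec(local_mats, conn, x)
y_ref = assemble_dense(local_mats, conn, n_elem + 1) @ x
```

The assembly-free path stores only the small dense blocks, so memory grows with the number of elements rather than with the size of a global sparse matrix, and each block is reused in a dense product that has higher arithmetic intensity than a sparse row traversal.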

## Key findings

- Achieves ideal convergence rates of 0.2 or less for high polynomial orders.
- Attains 80% of peak bandwidth performance with high-order polynomials.
- Demonstrates high efficiency and scalability on Intel Xeon Phi (Knights Landing) processors.
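
As a rough guide to what a convergence rate of 0.2 means in practice, the sketch below counts the cycles needed to reach a target reduction. It assumes the reported rate is a per-cycle error reduction factor that stays constant across cycles; the tolerance `1e-8` and the comparison rate `0.9` are arbitrary illustrative choices, not values from the paper.

```python
import math


def cycles_to_tolerance(rho: float, tol: float) -> int:
    """Smallest k with rho**k <= tol, for a constant per-cycle factor rho."""
    if not (0.0 < rho < 1.0 and 0.0 < tol < 1.0):
        raise ValueError("rho and tol must both lie strictly between 0 and 1")
    return math.ceil(math.log(tol) / math.log(rho))


cycles_fast = cycles_to_tolerance(0.2, 1e-8)  # rate quoted in the summary
cycles_slow = cycles_to_tolerance(0.9, 1e-8)  # hypothetical slow solver

print(cycles_fast, cycles_slow)  # prints "12 175"
```

Under these assumptions, a factor of 0.2 reaches eight orders of magnitude in 12 cycles, while a factor of 0.9 would need 175.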

## Abstract

We present a parallel computing strategy for a hybridizable discontinuous Galerkin (HDG) nested geometric multigrid (GMG) solver. Parallel GMG solvers require a combination of coarse-grain and fine-grain parallelism to improve time-to-solution performance. In this work we focus on fine-grain parallelism. We use Intel's second-generation Xeon Phi (Knights Landing) many-core processor. The GMG method achieves ideal convergence rates of $0.2$ or less for high polynomial orders. A matrix-free (assembly-free) technique is exploited to save considerable memory usage and increase arithmetic intensity. HDG enables static condensation, and due to the discontinuous nature of the discretization, we developed a matrix-vector multiply routine that does not require any costly synchronizations or barriers. Our algorithm is able to attain 80% of peak bandwidth performance for higher-order polynomials. This is possible due to the data locality inherent in the HDG method. Very high performance is realized for high-order schemes, due to good arithmetic intensity, which declines as the order is reduced.
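
The static condensation mentioned in the abstract can be illustrated with a generic Schur complement solve. The sketch below is not the paper's implementation: it assumes a small dense block system with interior unknowns `u` and trace unknowns `lam`, and the block names `A`, `B`, `C`, `D` and the random test matrix are hypothetical. In HDG the interior block is element-local, which is what makes the elimination cheap; here it is simply a dense block.

```python
import numpy as np


def solve_condensed(A, B, C, D, f, g):
    """Solve [[A, B], [C, D]] [u; lam] = [f; g] by eliminating u first."""
    A_inv_B = np.linalg.solve(A, B)
    A_inv_f = np.linalg.solve(A, f)
    S = D - C @ A_inv_B                        # Schur complement on the traces
    lam = np.linalg.solve(S, g - C @ A_inv_f)  # condensed (smaller) system
    u = A_inv_f - A_inv_B @ lam                # recover the interior unknowns
    return u, lam


# Hypothetical well-conditioned test system (symmetric positive definite).
rng = np.random.default_rng(1)
n_u, n_lam = 6, 3
n = n_u + n_lam
R = rng.standard_normal((n, n))
M = R @ R.T + n * np.eye(n)
A, B = M[:n_u, :n_u], M[:n_u, n_u:]
C, D = M[n_u:, :n_u], M[n_u:, n_u:]
rhs = rng.standard_normal(n)

u, lam = solve_condensed(A, B, C, D, rhs[:n_u], rhs[n_u:])
full = np.linalg.solve(M, rhs)  # reference: solve the uncondensed system
```

Only the condensed system for `lam` has to be handled by the global solver; the interior unknowns are recovered afterwards from local solves.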

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.09907/full.md

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/1705.09907/full.md

## References

58 references — full list in the complete paper: https://tomesphere.com/paper/1705.09907/full.md

---
Source: https://tomesphere.com/paper/1705.09907