Finite Element Integration with Quadrature on the GPU
Matthew G. Knepley, Karl Rupp, Andy R. Terrel

TL;DR
This paper introduces a quadrature-based finite element integration method for low-order elements on GPUs. On an NVIDIA GTX580 in single precision, it reaches close to 300 GF/s in 2D and over 400 GF/s in 3D for a first-order discretization of the variable-coefficient Laplacian.
Contribution
The authors develop a thread transposition pattern for finite element integration on GPUs that vectorizes aggressively while avoiding reductions, reaching 90% of their measured achievable bandwidth peak (310 GF/s).
Findings
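Achieves close to 300 GF/s in 2D and over 400 GF/s in 3D for Laplacian integration (single precision, NVIDIA GTX580)
Measured performance matches the bandwidth-limited model predictions, including in double precision (120 GF/s in 2D, 150 GF/s in 3D)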
Effective for vector-valued PDEs like linear elasticity
Abstract
We present a novel, quadrature-based finite element integration method for low-order elements on GPUs, using a pattern we call thread transposition to avoid reductions while vectorizing aggressively. On the NVIDIA GTX580, which has a nominal single precision peak flop rate of 1.5 TF/s and a memory bandwidth of 192 GB/s, we achieve close to 300 GF/s for element integration on first-order discretization of the Laplacian operator with variable coefficients in two dimensions, and over 400 GF/s in three dimensions. From our performance model we find that this corresponds to 90% of our measured achievable bandwidth peak of 310 GF/s. Further experimental results also match the predicted performance when used with double precision (120 GF/s in two dimensions, 150 GF/s in three dimensions). Results obtained for the linear elasticity equations (220 GF/s and 70 GF/s in two dimensions,…
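To make "quadrature-based element integration" concrete, here is a minimal serial sketch in plain Python for one first-order (P1) triangle of the variable-coefficient Laplacian. It is not the authors' GPU kernel and does not show thread transposition; the function name `element_stiffness`, the 3-point quadrature rule, and the choice to assemble an element matrix are all assumptions made for illustration.

```python
# Illustrative sketch only: serial quadrature-based integration on one P1 triangle.
# All names here are hypothetical; this is not the paper's GPU implementation.

# 3-point rule on the reference triangle (0,0),(1,0),(0,1); exact for quadratics.
QPOINTS = [(1.0 / 6.0, 1.0 / 6.0), (2.0 / 3.0, 1.0 / 6.0), (1.0 / 6.0, 2.0 / 3.0)]
QWEIGHTS = [1.0 / 6.0, 1.0 / 6.0, 1.0 / 6.0]

# Gradients of the P1 basis functions on the reference triangle (constant).
REF_GRADS = [(-1.0, -1.0), (1.0, 0.0), (0.0, 1.0)]


def element_stiffness(verts, coeff):
    """Return the 3x3 matrix K[i][j] = integral of coeff * grad(phi_i) . grad(phi_j)."""
    (x0, y0), (x1, y1), (x2, y2) = verts
    # Jacobian of the affine map from the reference triangle to the physical one.
    j00, j01 = x1 - x0, x2 - x0
    j10, j11 = y1 - y0, y2 - y0
    det = j00 * j11 - j01 * j10
    inv = [[j11 / det, -j01 / det], [-j10 / det, j00 / det]]
    # Physical gradients: apply the inverse-transpose Jacobian to reference gradients.
    grads = [
        (inv[0][0] * gx + inv[1][0] * gy, inv[0][1] * gx + inv[1][1] * gy)
        for gx, gy in REF_GRADS
    ]
    K = [[0.0] * 3 for _ in range(3)]
    for (xi, eta), w in zip(QPOINTS, QWEIGHTS):
        # Map the quadrature point to physical space and sample the coefficient.
        x = x0 + j00 * xi + j01 * eta
        y = y0 + j10 * xi + j11 * eta
        scale = w * coeff(x, y) * abs(det)
        for i in range(3):
            for j in range(3):
                dot = grads[i][0] * grads[j][0] + grads[i][1] * grads[j][1]
                K[i][j] += scale * dot
    return K
```

Each quadrature point contributes an independent term to every entry of the element matrix, and the sum over points is the kind of accumulation that a GPU implementation has to organize carefully across threads.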
Taxonomy
Topics: Numerical methods in engineering · Advanced Numerical Methods in Computational Mathematics · Fluid Dynamics Simulations and Interactions
