Code Generation for Near-Roofline Finite Element Actions on GPUs from Symbolic Variational Forms
Kaushik Kulkarni, Andreas Klöckner

TL;DR
This paper introduces a GPU parallelization strategy for finite element variational forms using code transformations and heuristic search, achieving near-roofline performance in FEM computations.
Contribution
It presents a novel code transformation and heuristic-based search approach for efficient FEM variational form evaluation on GPUs, integrated into the Firedrake framework.
Findings
Achieves more than 50% of roofline performance in 65% of test cases
Effective on multiple GPU architectures (Volta and Kepler)
Applicable to various FEM operators in 2D and 3D geometries
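To make the "percent of roofline" figures above concrete, here is a minimal sketch of how such a fraction is commonly computed from the standard roofline model. The function name, the device numbers, and the kernel numbers are all hypothetical and are not taken from the paper.

```python
def roofline_fraction(achieved_gflops, arithmetic_intensity,
                      peak_gflops, peak_bandwidth_gbs):
    """Return achieved throughput as a fraction of the roofline bound.

    arithmetic_intensity is in FLOP per byte of memory traffic.
    """
    # The roofline bound is the lower of the compute peak and the
    # throughput the memory system can feed at this intensity.
    bound = min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)
    return achieved_gflops / bound


# Hypothetical device and kernel, for illustration only.
frac = roofline_fraction(
    achieved_gflops=1000.0,
    arithmetic_intensity=2.0,   # FLOP/byte
    peak_gflops=7000.0,
    peak_bandwidth_gbs=900.0,
)
print(f"{frac:.1%} of roofline")
```

In this made-up case the kernel is memory-bound (2.0 FLOP/byte times 900 GB/s is below the compute peak), so the bound is set by bandwidth rather than by peak FLOP/s.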
Abstract
We present a novel parallelization strategy for evaluating Finite Element Method (FEM) variational forms on GPUs, focusing on those that are expressible through the Unified Form Language (UFL) on simplex meshes. We base our approach on code transformations, wherein we construct a space of scheduling candidates and rank them via a heuristic cost model to effectively handle the large diversity of computational workloads that can be expressed in this way. We present a design of a search space to which the cost model is applied, along with an associated pruning strategy to limit the number of configurations that need to be empirically evaluated. The goal of our design is to strike a balance between the device's latency-hiding capabilities and the amount of state space, a key factor in attaining near-roofline performance. To make our work widely available, we have prototyped our…
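The workflow the abstract outlines (build a space of scheduling candidates, rank them with a heuristic cost model, prune, then time only the survivors) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the `Candidate` fields, the resource limits, the cost formula, and the stand-in timing function are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List

INFEASIBLE = float("inf")


@dataclass(frozen=True)
class Candidate:
    """One hypothetical scheduling choice for an FEM action kernel."""
    cells_per_workgroup: int
    threads_per_cell: int


def build_search_space() -> List[Candidate]:
    # Hypothetical parameter ranges; a real space would be far richer.
    return [Candidate(c, t)
            for c, t in product((1, 2, 4, 8, 16, 32), (1, 2, 4, 8, 16))]


def heuristic_cost(c: Candidate,
                   state_bytes_per_cell: int = 2048,
                   local_mem_bytes: int = 48 * 1024) -> float:
    """Toy cost (lower is better): more resident threads hide more latency,
    but each work-group's state must fit in on-chip memory."""
    threads = c.cells_per_workgroup * c.threads_per_cell
    state = c.cells_per_workgroup * state_bytes_per_cell
    if threads > 1024 or state > local_mem_bytes:
        return INFEASIBLE
    resident_workgroups = local_mem_bytes // state
    resident_threads = min(2048, resident_workgroups * threads)
    return 1.0 / resident_threads


def select(candidates: List[Candidate],
           cost: Callable[[Candidate], float],
           measure: Callable[[Candidate], float],
           keep: int = 3) -> Candidate:
    # Prune infeasible candidates, keep the best few by predicted cost,
    # and run the (expensive) empirical timing only on that shortlist.
    feasible = [c for c in candidates if cost(c) != INFEASIBLE]
    shortlist = sorted(feasible, key=cost)[:keep]
    return min(shortlist, key=measure)


def fake_measure(c: Candidate) -> float:
    # Stand-in for a real kernel timing run; the formula is arbitrary.
    return abs(c.threads_per_cell - 4) + 1.0 / c.cells_per_workgroup


space = build_search_space()
best = select(space, heuristic_cost, fake_measure)
print(best)
```

The point of the two-stage design is that the cost model is cheap to evaluate over the whole space, while timing runs are reserved for the handful of candidates that survive pruning.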
