Multigrid methods for the Stokes problem on GPU systems
Cu Cui, Guido Kanschat

TL;DR
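This paper introduces a high-performance, matrix-free multigrid method for the Stokes problem on GPUs. It exploits the tensor product structure of the elements and a conflict-free shared memory access pattern, and its operator evaluation reaches over one billion degrees of freedom per second on a single NVIDIA A100 GPU.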
Contribution
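It develops a novel multigrid solver, optimized for GPU architectures, that operates directly on the velocity and pressure spaces and needs no global Schur complement approximation. Schur complements are used only inside the local solver of the multiplicative Schwarz smoother.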
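To make the local Schur complement step concrete, here is a minimal sketch of how a small saddle-point system with velocity block A and divergence block B can be reduced to a pressure-only system. It is an illustration under stated assumptions, not the authors' implementation: the matrices are made-up 3x3 and 1x3 examples, the names `solve` and `schur_solve` are hypothetical, and dense Gaussian elimination stands in for the inverse of A, which the abstract says is evaluated with the fast diagonalization method.

```python
def solve(M, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(M)
    T = [row[:] + [b[i]] for i, row in enumerate(M)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(T[r][c]))
        T[c], T[p] = T[p], T[c]
        for r in range(c + 1, n):
            factor = T[r][c] / T[c][c]
            for k in range(c, n + 1):
                T[r][k] -= factor * T[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (T[r][n] - sum(T[r][k] * x[k] for k in range(r + 1, n))) / T[r][r]
    return x


def schur_solve(A, B, f, g):
    """Solve [[A, B^T], [B, 0]] [u, p] = [f, g] by eliminating the velocity u."""
    n, m = len(A), len(B)
    # Row j holds the vector A^{-1} B^T e_j (one solve per pressure unknown).
    AinvBt = [solve(A, [B[j][k] for k in range(n)]) for j in range(m)]
    Ainvf = solve(A, f)
    # Schur complement S = B A^{-1} B^T, an m x m matrix.
    S = [[sum(B[i][k] * AinvBt[j][k] for k in range(n)) for j in range(m)]
         for i in range(m)]
    # Pressure equation: S p = B A^{-1} f - g.
    rhs = [sum(B[i][k] * Ainvf[k] for k in range(n)) - g[i] for i in range(m)]
    p = solve(S, rhs)
    # Back-substitution: u = A^{-1} (f - B^T p).
    u = [Ainvf[k] - sum(AinvBt[j][k] * p[j] for j in range(m)) for k in range(n)]
    return u, p


# Illustrative data only (assumed, not taken from the paper).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
B = [[1.0, 1.0, 1.0]]
f = [1.0, 2.0, 3.0]
g = [0.0]
u, p = schur_solve(A, B, f, g)

# Residuals of the two block rows of the saddle-point system.
res_momentum = max(abs(sum(A[i][k] * u[k] for k in range(3)) + B[0][i] * p[0] - f[i])
                   for i in range(3))
res_div = abs(sum(B[0][k] * u[k] for k in range(3)) - g[0])
```

On a small patch this elimination is exact, which is why no approximation of a global Schur complement is needed.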
Findings
Matrix-free operator evaluation reaches over one billion degrees of freedom per second on a single NVIDIA A100 GPU.
Demonstrates efficiency comparable to 3D Poisson problem solvers.
Operates effectively on large-scale Stokes problems with high accuracy.
Abstract
This paper presents a matrix-free multigrid method for solving the Stokes problem, discretized using H(div)-conforming discontinuous Galerkin methods. We employ a Schur complement method combined with the fast diagonalization method for the efficient evaluation of the local solver within the multiplicative Schwarz smoother. This approach operates directly on both the velocity and pressure spaces, eliminating the need for a global Schur complement approximation. By leveraging the tensor product structure of Raviart-Thomas elements and an optimized, conflict-free shared memory access pattern, the matrix-free operator evaluation demonstrates excellent performance numbers, reaching over one billion degrees of freedom per second on a single NVIDIA A100 GPU. Numerical results indicate efficiency comparable to that of the three-dimensional Poisson problem.
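The fast diagonalization method named in the abstract can be illustrated on a simpler model problem. The sketch below is an assumption-laden stand-in, not the paper's local solver: it uses the 1D finite-difference matrix A = tridiag(-1, 2, -1), whose eigenpairs are known in closed form, instead of Raviart-Thomas patch matrices, and the function names are hypothetical. It solves (kron(A, I) + kron(I, A)) u = f on an n x n grid, written in matrix form as A U + U A = F, using only small dense matrix products and a pointwise division.

```python
import math


def sine_eigensystem(n):
    """Eigenvalues and eigenvectors of the n x n matrix tridiag(-1, 2, -1)."""
    lam = [2.0 - 2.0 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]
    scale = math.sqrt(2.0 / (n + 1))
    # S is symmetric and orthogonal, so S^T = S^{-1} = S.
    S = [[scale * math.sin(j * k * math.pi / (n + 1)) for k in range(1, n + 1)]
         for j in range(1, n + 1)]
    return lam, S


def matmul(X, Y):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def fast_diagonalization_solve(F):
    """Solve A U + U A = F with A = tridiag(-1, 2, -1)."""
    n = len(F)
    lam, S = sine_eigensystem(n)
    G = matmul(matmul(S, F), S)          # transform to the eigenbasis: S^T F S
    for i in range(n):
        for j in range(n):
            G[i][j] /= lam[i] + lam[j]   # the operator is diagonal in this basis
    return matmul(matmul(S, G), S)       # transform back: S G S^T


def apply_laplacian_2d(U):
    """Reference 5-point operator A U + U A with zero boundary values."""
    n = len(U)

    def at(i, j):
        return U[i][j] if 0 <= i < n and 0 <= j < n else 0.0

    return [[4.0 * U[i][j] - at(i - 1, j) - at(i + 1, j) - at(i, j - 1) - at(i, j + 1)
             for j in range(n)] for i in range(n)]


# Check the solve against the reference operator on arbitrary data.
n = 4
F = [[float(i + 2 * j + 1) for j in range(n)] for i in range(n)]
U = fast_diagonalization_solve(F)
R = apply_laplacian_2d(U)
residual = max(abs(R[i][j] - F[i][j]) for i in range(n) for j in range(n))
```

The cost is a handful of n x n matrix products per solve rather than a factorization of the full n^2 x n^2 system, which is the property that makes tensor product structure attractive for local solvers.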
