Fast Stencil Computations using Fast Fourier Transforms
Zafar Ahmad, Rezaul Chowdhury, Rathish Das, Pramod Ganapathi, Aaron Gregory, Yimin Zhu

TL;DR
This paper introduces novel algorithms using fast Fourier transforms to perform linear stencil computations more efficiently, significantly reducing computational work and runtime compared to existing methods, for both periodic and aperiodic boundary conditions.
Contribution
The paper presents the first algorithms that leverage FFTs to compute stencil evolutions in a single step, achieving o(NT) work, which is asymptotically less than the Θ(NT) work of prior approaches.
Findings
Algorithms perform o(NT) work, improving over Θ(NT) of previous methods.
Experimental results show orders of magnitude speedup for large grids and timesteps.
Effective for both periodic and aperiodic boundary conditions.
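The periodic case can be illustrated with a minimal sketch. It assumes a 1D grid and a fixed linear stencil, and it is not the paper's implementation. The function names, the `weights`/`offsets` parameters, and the example stencil are hypothetical choices made for this illustration. One periodic stencil step is a circular convolution, so T steps correspond to raising the kernel's DFT to the power T elementwise. The aperiodic case needs more machinery and is not covered here.

```python
import numpy as np

def evolve_naive(u0, weights, offsets, steps):
    """Apply a periodic linear stencil one timestep at a time (Theta(N*T) work)."""
    u = np.asarray(u0, dtype=float)
    for _ in range(steps):
        # new[i] = sum_k weights[k] * u[(i + offsets[k]) mod N]
        u = sum(w * np.roll(u, -off) for w, off in zip(weights, offsets))
    return u

def evolve_fft(u0, weights, offsets, steps):
    """Jump directly to timestep `steps` using FFTs (periodic boundary only)."""
    u0 = np.asarray(u0, dtype=float)
    n = u0.size
    # Build the circular-convolution kernel equivalent to one stencil step.
    kernel = np.zeros(n)
    for w, off in zip(weights, offsets):
        kernel[(-off) % n] += w
    # Convolution theorem: T convolutions become an elementwise T-th power.
    symbol = np.fft.fft(kernel)
    return np.real(np.fft.ifft(np.fft.fft(u0) * symbol ** steps))

# Hypothetical example: a 3-point averaging (heat-like) stencil.
weights, offsets = [0.25, 0.5, 0.25], [-1, 0, 1]
u0 = np.random.default_rng(0).random(64)
print(np.allclose(evolve_naive(u0, weights, offsets, 100),
                  evolve_fft(u0, weights, offsets, 100)))
```

In this sketch the FFT route costs two forward transforms, one inverse transform, and an elementwise power, whereas the loop touches every grid cell at every timestep. For stencils whose symbol has entries of magnitude greater than one, the power can overflow or amplify rounding error, so the sketch is only appropriate for stable stencils such as the one shown.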
Abstract
Stencil computations are widely used to simulate the change of state of physical systems across a multidimensional grid over multiple timesteps. The state-of-the-art techniques in this area fall into three groups: cache-aware tiled looping algorithms, cache-oblivious divide-and-conquer trapezoidal algorithms, and Krylov subspace methods. In this paper, we present two efficient parallel algorithms for performing linear stencil computations. Current direct solvers in this domain are computationally inefficient, and Krylov methods require manual labor and mathematical training. We solve these problems for linear stencils by using DFT preconditioning on a Krylov method to achieve a direct solver which is both fast and general. Indeed, while all currently available algorithms for solving general linear stencils perform Θ(NT) work, where N is the size of the spatial grid and T is…
Taxonomy
Topics: Matrix Theory and Algorithms · Parallel Computing and Optimization Techniques · Electromagnetic Scattering and Analysis
