Fast And Scalable FFT-Based GPU-Accelerated Algorithms for Block-Triangular Toeplitz Matrices With Application to Linear Inverse Problems Governed by Autonomous Dynamical Systems
Sreeram Venkat, Milinda Fernando, Stefan Henneking, Omar Ghattas

TL;DR
This paper introduces a GPU-accelerated, FFT-based algorithm for fast matrix-vector multiplications with block Toeplitz matrices, which speeds up the solution of large-scale inverse problems governed by linear time-invariant PDE systems.
Contribution
The paper presents a novel scalable algorithm leveraging FFTs for block Toeplitz matrices, enabling fast Hessian matvecs in PDE-based inverse problems on multi-GPU systems.
Findings
Achieves over 80% of peak bandwidth on NVIDIA A100 GPUs.
Demonstrates excellent weak scaling up to 48 GPUs.
Performs Hessian matvecs in fractions of a second, orders of magnitude faster than conventional matrix-free adjoint-based matvecs, each of which requires a pair of forward/adjoint PDE solves.
Abstract
We present an efficient and scalable algorithm for performing matrix-vector multiplications ("matvecs") for block Toeplitz matrices. Such matrices, which are shift-invariant with respect to their blocks, arise in the context of solving inverse problems governed by autonomous systems, and time-invariant systems in particular. In this article, we consider inverse problems that infer unknown parameters from observational data of a linear time-invariant dynamical system given in the form of partial differential equations (PDEs). Matrix-free Newton-conjugate-gradient methods are often the gold standard for solving these inverse problems, but they require numerous actions of the Hessian on a vector. Matrix-free adjoint-based Hessian matvecs require solution of a pair of linearized forward/adjoint PDE solves per Hessian action, which may be prohibitive for large-scale inverse problems. Time…
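The core idea named in the abstract, an FFT-based matvec for a block Toeplitz matrix, can be illustrated with a small CPU sketch. This is not the authors' multi-GPU implementation. It assumes a block lower-triangular Toeplitz matrix (as the title suggests) stored by its first block column, and the function names, array layouts, and sizes are hypothetical choices made for illustration. The matrix is embedded in a block circulant one by zero-padding the time axis to twice its length, so the product becomes an independent small matvec at each frequency.

```python
import numpy as np


def block_toeplitz_matvec_fft(blocks, m):
    """FFT-based matvec with a block lower-triangular Toeplitz matrix.

    blocks: array of shape (nt, nd, nm), the first block column F_0..F_{nt-1}
            (hypothetical layout; block (i, j) of the matrix is F_{i-j} for i >= j).
    m:      array of shape (nt, nm), the input vector split into nt blocks.
    Returns an array of shape (nt, nd).
    """
    nt = blocks.shape[0]
    # Zero-pad the time axis to 2*nt so circular convolution equals linear convolution.
    f_hat = np.fft.rfft(blocks, n=2 * nt, axis=0)  # (nt + 1, nd, nm)
    m_hat = np.fft.rfft(m, n=2 * nt, axis=0)       # (nt + 1, nm)
    # In Fourier space the product decouples into one small matvec per frequency.
    d_hat = np.einsum("kij,kj->ki", f_hat, m_hat)
    d = np.fft.irfft(d_hat, n=2 * nt, axis=0)
    # Drop the padded half of the time axis.
    return d[:nt]


def block_toeplitz_matvec_dense(blocks, m):
    """Reference O(nt^2) matvec used only to check the FFT version."""
    nt, nd, _ = blocks.shape
    d = np.zeros((nt, nd))
    for i in range(nt):
        for j in range(i + 1):
            d[i] += blocks[i - j] @ m[j]
    return d
```

For real inputs the two functions should agree to floating-point tolerance, which can be checked with `np.allclose` on random data. The FFT version costs O(nt log nt) in the number of time steps instead of O(nt^2).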
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Computer Graphics and Visualization Techniques · Neural Networks and Applications
