A GPU-Accelerated Fast Summation Method Based on Barycentric Lagrange Interpolation and Dual Tree Traversal
Leighton Wilson, Nathan Vaughn, and Robert Krasny

TL;DR
This paper introduces a GPU-accelerated fast summation method called BLDTT, which uses barycentric Lagrange interpolation and dual tree traversal to efficiently compute particle interactions with linear scaling on large problems.
Contribution
The paper presents a kernel-independent GPU implementation of the BLDTT that achieves linear scaling and outperforms previous methods such as the barycentric Lagrange treecode (BLTC) in particle interaction computations.
Findings
BLDTT achieves O(N) scaling on large particle problems.
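The OpenACC GPU implementation, with MPI remote memory access for distributed memory parallelization, is demonstrated across different problem sizes, particle distributions, geometric domains, and interaction kernels.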
Comparison shows BLDTT outperforms earlier treecode methods.
Abstract
We present the barycentric Lagrange dual tree traversal (BLDTT) fast summation method for particle interactions. The scheme replaces well-separated particle-particle interactions by adaptively chosen particle-cluster, cluster-particle, and cluster-cluster approximations given by barycentric Lagrange interpolation at proxy particles on a Chebyshev grid in each cluster. The BLDTT is kernel-independent and the approximations can be efficiently mapped onto GPUs, where target particles provide an outer level of parallelism and source particles provide an inner level of parallelism. We present an OpenACC GPU implementation of the BLDTT with MPI remote memory access for distributed memory parallelization. The performance of the GPU-accelerated BLDTT is demonstrated for calculations with different problem sizes, particle distributions, geometric domains, and interaction kernels, as well as for…
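The particle-cluster approximation mentioned in the abstract can be illustrated with a minimal 1D sketch. This is not the authors' OpenACC implementation: the function names, the 1/|x - y| kernel, and the choice of the sources' minimum and maximum as the cluster interval are all assumptions made for illustration. The sketch places proxy particles at Chebyshev points of the second kind, computes proxy charges with barycentric Lagrange basis functions, and evaluates the kernel only at the proxy particles.

```python
import numpy as np

def chebyshev_points(a, b, n):
    """n+1 Chebyshev points of the second kind mapped to [a, b]."""
    k = np.arange(n + 1)
    return 0.5 * (a + b) + 0.5 * (b - a) * np.cos(np.pi * k / n)

def barycentric_weights(n):
    """Barycentric weights for the points above: (-1)^k, halved at both ends."""
    w = (-1.0) ** np.arange(n + 1)
    w[0] *= 0.5
    w[-1] *= 0.5
    return w

def lagrange_basis(y, s, w):
    """Matrix L[j, k] = L_k(y_j) in barycentric form."""
    diff = y[:, None] - s[None, :]
    hit = diff == 0.0                 # source lies exactly on a proxy point
    diff[hit] = 1.0                   # placeholder to avoid division by zero
    terms = w[None, :] / diff
    L = terms / terms.sum(axis=1, keepdims=True)
    rows = hit.any(axis=1)
    L[rows] = hit[rows].astype(float)  # L_k(s_m) = 1 if k == m, else 0
    return L

def kernel(x, y):
    """Illustrative kernel G(x, y) = 1/|x - y| (an assumption, not from the paper)."""
    return 1.0 / np.abs(x[:, None] - y[None, :])

def direct_sum(x, y, q):
    """Exact O(M*N) particle-particle sum."""
    return kernel(x, y) @ q

def proxy_charges(y, q, s, w):
    """Proxy charges Q_k = sum_j L_k(y_j) q_j."""
    return lagrange_basis(y, s, w).T @ q

def particle_cluster(x, y, q, n):
    """Approximate the potential at targets x using n+1 proxy particles."""
    s = chebyshev_points(y.min(), y.max(), n)
    w = barycentric_weights(n)
    return kernel(x, s) @ proxy_charges(y, q, s, w)
```

Because only kernel evaluations at the proxy particles are needed, the same code works for any kernel that is smooth when targets and sources are well separated, which is the sense in which such a scheme is kernel-independent. In this sketch the targets would play the role of the outer parallel loop and the proxy particles the inner one.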
