A GPU-Accelerated Fast Summation Method Based on Barycentric Lagrange Interpolation and Dual Tree Traversal

Leighton Wilson, Nathan Vaughn, and Robert Krasny

arXiv:2012.06925 · physics.comp-ph · June 2, 2021

TL;DR

This paper introduces the barycentric Lagrange dual tree traversal (BLDTT), a GPU-accelerated fast summation method for particle interactions. It combines barycentric Lagrange interpolation with dual tree traversal and is reported to scale as O(N) on large problems.

Contribution

The paper presents a kernel-independent GPU implementation of the BLDTT, written in OpenACC with MPI remote memory access for distributed memory parallelization. It is reported to scale linearly on large problems and to outperform the earlier barycentric Lagrange treecode (BLTC).

Findings

01

BLDTT achieves O(N) scaling on large particle problems.

02

The OpenACC GPU implementation is demonstrated across different problem sizes, particle distributions, geometric domains, and interaction kernels.

03

In the paper's comparisons, BLDTT outperforms the earlier BLTC treecode.

Abstract

We present the barycentric Lagrange dual tree traversal (BLDTT) fast summation method for particle interactions. The scheme replaces well-separated particle-particle interactions by adaptively chosen particle-cluster, cluster-particle, and cluster-cluster approximations given by barycentric Lagrange interpolation at proxy particles on a Chebyshev grid in each cluster. The BLDTT is kernel-independent and the approximations can be efficiently mapped onto GPUs, where target particles provide an outer level of parallelism and source particles provide an inner level of parallelism. We present an OpenACC GPU implementation of the BLDTT with MPI remote memory access for distributed memory parallelization. The performance of the GPU-accelerated BLDTT is demonstrated for calculations with different problem sizes, particle distributions, geometric domains, and interaction kernels, as well as for…
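The approximations described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' OpenACC/MPI code: the kernel 1/|x - y|, the function names, the interpolation degree, and the use of a single pair of well-separated clusters (no tree, no traversal, no GPU) are all assumptions made for illustration. Source charges are first interpolated onto proxy particles at Chebyshev points (particle-cluster), and the proxy potentials at Chebyshev points of the target cluster are then interpolated back to the targets (cluster-cluster).

```python
import math

def chebyshev_points(a, b, n):
    """n + 1 Chebyshev points of the second kind, mapped to [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos(math.pi * k / n)
            for k in range(n + 1)]

def barycentric_weights(n):
    """Barycentric weights for those points: (-1)^k, halved at both ends."""
    return [(-1) ** k * (0.5 if k in (0, n) else 1.0) for k in range(n + 1)]

def lagrange_basis(x, nodes, weights):
    """Values L_k(x) of all Lagrange basis polynomials in barycentric form."""
    for k, s in enumerate(nodes):
        if x == s:  # x sits exactly on a node: L_k reduces to a Kronecker delta
            return [1.0 if j == k else 0.0 for j in range(len(nodes))]
    terms = [w / (x - s) for w, s in zip(weights, nodes)]
    total = sum(terms)
    return [t / total for t in terms]

def kernel(x, y):
    """Illustrative kernel (an assumption); the method itself is kernel-independent."""
    return 1.0 / abs(x - y)

def direct_sum(targets, sources, charges):
    """Reference O(M * N) particle-particle summation."""
    return [sum(kernel(x, y) * q for y, q in zip(sources, charges))
            for x in targets]

def proxy_charges(sources, charges, nodes, weights):
    """Proxy charges q_hat_k = sum_j L_k(y_j) * q_j at the Chebyshev points."""
    q_hat = [0.0] * len(nodes)
    for y, q in zip(sources, charges):
        for k, basis_value in enumerate(lagrange_basis(y, nodes, weights)):
            q_hat[k] += basis_value * q
    return q_hat

def particle_cluster(targets, sources, charges, n):
    """Targets interact with the source cluster's proxy particles."""
    w = barycentric_weights(n)
    s = chebyshev_points(min(sources), max(sources), n)
    return direct_sum(targets, s, proxy_charges(sources, charges, s, w))

def cluster_cluster(targets, sources, charges, n):
    """Proxy-to-proxy interaction, then interpolation back to the targets."""
    w = barycentric_weights(n)
    s = chebyshev_points(min(sources), max(sources), n)
    t = chebyshev_points(min(targets), max(targets), n)
    q_hat = proxy_charges(sources, charges, s, w)
    phi_hat = direct_sum(t, s, q_hat)  # potentials at target proxy particles
    return [sum(b * p for b, p in zip(lagrange_basis(x, t, w), phi_hat))
            for x in targets]

if __name__ == "__main__":
    # Two well-separated clusters (hypothetical test data).
    sources = [(i + 0.5) / 20 for i in range(20)]
    charges = [1.0 + 0.05 * i for i in range(20)]
    targets = [4.0 + (i + 0.5) / 10 for i in range(10)]
    exact = direct_sum(targets, sources, charges)
    for name, approx in (("particle-cluster", particle_cluster),
                         ("cluster-cluster", cluster_cluster)):
        values = approx(targets, sources, charges, 8)
        err = max(abs(v - e) for v, e in zip(values, exact))
        print(name, "max abs error:", err)
```

In this sketch the proxy-to-proxy step is itself a small direct sum, which suggests why such approximations map well onto GPUs: every stage has the same dense double-loop form as the direct sum, only over fewer points.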
