Extreme Scale FMM-Accelerated Boundary Integral Equation Solver for Wave Scattering

Mustafa Abduljabbar, Mohammed Al Farhan, Noha Al-Harthi, Rui Chen, Rio Yokota, Hakan Bagci, and David Keyes

arXiv:1803.09948·cs.PF·March 28, 2018

TL;DR

This paper presents an optimized FMM-accelerated boundary integral equation solver for wave scattering that reaches 77% of peak performance on Skylake processors and shows near-linear weak scalability up to 6,144 nodes.

Contribution

It introduces a highly optimized, architecture-aware FMM-based solver for wave scattering, with kernels for both shared and distributed memory that exploit thread- and data-level parallelism, including the AVX-512 SIMD units of Intel Skylake and Knights Landing.

Findings

01. Achieves 77% of peak performance on Skylake processors.

02. Demonstrates near-linear weak scalability up to 6,144 nodes.

03. Computes over 2 billion degrees of freedom on Cray XC40.

Abstract

Algorithmic and architecture-oriented optimizations are essential for achieving performance worthy of anticipated energy-austere exascale systems. In this paper, we present an extreme scale FMM-accelerated boundary integral equation solver for wave scattering, which uses FMM as a matrix-vector multiplication inside the GMRES iterative method. Our FMM Helmholtz kernels treat nontrivial singular and near-field integration points. We implement highly optimized kernels for both shared and distributed memory, targeting emerging Intel extreme performance HPC architectures. We extract the potential thread- and data-level parallelism of the key Helmholtz kernels of FMM. Our application code is well optimized to exploit the AVX-512 SIMD units of Intel Skylake and Knights Landing architectures. We provide different performance models for tuning the task-based tree traversal implementation of FMM,…
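The abstract's central idea, using FMM as the matrix-vector product inside GMRES, can be illustrated with a small matrix-free sketch. This is not the authors' code: the direct O(N^2) summation below merely stands in for an FMM evaluation, the singular self-interaction is simply skipped rather than integrated, and the names `helmholtz_matvec`, `gmres`, `weight`, and the toy point set are assumptions made for illustration. The operator is taken to be the identity plus a scaled Helmholtz kernel sum so that the toy system is well conditioned.

```python
import cmath
import math


def norm(v):
    """Euclidean norm of a complex vector stored as a list."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))


def helmholtz_matvec(points, k, x, weight):
    """Direct O(N^2) stand-in for an FMM matvec: y = x + weight * G x,
    with G_ij = exp(i k r_ij) / (4 pi r_ij). The singular self term is
    skipped here (a simplification; the paper treats it explicitly)."""
    y = list(x)
    for i, p in enumerate(points):
        acc = 0j
        for j, q in enumerate(points):
            if i != j:
                r = math.dist(p, q)
                acc += cmath.exp(1j * k * r) / (4.0 * math.pi * r) * x[j]
        y[i] += weight * acc
    return y


def gmres(matvec, b, tol=1e-10, max_iter=50):
    """Unrestarted GMRES with zero initial guess. Only matvec(v) is
    needed, so the matrix is never formed. Returns (x, estimated
    relative residual)."""
    n, beta = len(b), norm(b)
    if beta == 0.0:
        return [0j] * n, 0.0
    V = [[z / beta for z in b]]   # Arnoldi basis vectors
    R = []                        # rotated Hessenberg columns
    cs, sn, g = [], [], [complex(beta)]
    for j in range(max_iter):
        w = matvec(V[j])
        col = []
        for i in range(j + 1):    # modified Gram-Schmidt
            h = sum(a.conjugate() * c for a, c in zip(V[i], w))
            col.append(h)
            w = [c - h * a for c, a in zip(w, V[i])]
        h_next = norm(w)
        col.append(complex(h_next))
        for i in range(j):        # apply earlier Givens rotations
            t = cs[i].conjugate() * col[i] + sn[i] * col[i + 1]
            col[i + 1] = -sn[i] * col[i] + cs[i] * col[i + 1]
            col[i] = t
        r = math.sqrt(abs(col[j]) ** 2 + h_next ** 2)
        if r == 0.0:
            break                 # breakdown: keep the previous iterate
        c, s = col[j] / r, h_next / r
        col[j], col[j + 1] = complex(r), 0j
        cs.append(c)
        sn.append(s)
        g.append(-s * g[j])
        g[j] = c.conjugate() * g[j]
        R.append(col)
        if abs(g[j + 1]) <= tol * beta or h_next == 0.0:
            break
        V.append([z / h_next for z in w])
    m = len(R)
    y = [0j] * m
    for i in range(m - 1, -1, -1):  # back substitution on R y = g
        acc = g[i] - sum(R[q][i] * y[q] for q in range(i + 1, m))
        y[i] = acc / R[i][i]
    x = [sum(y[j] * V[j][i] for j in range(m)) for i in range(n)]
    return x, abs(g[m]) / beta


# Toy problem: 20 points on a helix, plane-wave right-hand side.
points = [(math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20), 0.1 * i)
          for i in range(20)]
k = 2.0
apply_op = lambda v: helmholtz_matvec(points, k, v, weight=0.1)
b = [cmath.exp(1j * k * p[2]) for p in points]
x, est_res = gmres(apply_op, b)
true_res = norm([bi - ai for bi, ai in zip(b, apply_op(x))]) / norm(b)
print("relative residual:", true_res)
```

Because `gmres` only ever calls `matvec`, replacing `helmholtz_matvec` with a fast multipole evaluation would change the cost of each iteration without touching the solver loop, which is the structural point the abstract makes.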

Code & Models

Repositories

ecrc/BEMFMM