Parallel tree algorithms for AMR and non-standard data access
Carsten Burstedde

TL;DR
This paper presents parallel algorithms for distributed forests of adaptive quadtrees/octrees, aimed at large-scale scientific applications whose data layouts are more complex than those of standard finite elements. It demonstrates scalability to billions of elements and particles on supercomputers.
Contribution
The paper introduces new parallel algorithms for adaptive forest data structures that support complex data access patterns and variable process counts, enhancing scalability and flexibility.
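To make "variable process counts" and per-element data sizes concrete, here is a minimal sketch of how byte ranges in a single contiguous file could be derived when every element carries a different amount of data. It is an illustration under stated assumptions, not the paper's algorithm: the function names `element_offsets`, `split_equal_counts`, and `process_byte_ranges` are hypothetical, and the equal-count split is assumed for simplicity.

```python
from itertools import accumulate


def element_offsets(sizes):
    """Prefix sums of per-element byte sizes.

    Returns a list of length len(sizes) + 1 where entry i is the byte
    offset of element i and the last entry is the total file size.
    """
    return [0] + list(accumulate(sizes))


def split_equal_counts(n_elements, n_procs):
    """First element index of each process, plus a final end marker.

    Assumption for this sketch: elements are split by count, in order.
    """
    return [(p * n_elements) // n_procs for p in range(n_procs + 1)]


def process_byte_ranges(sizes, n_procs):
    """Half-open byte range [begin, end) that each process reads or writes."""
    offsets = element_offsets(sizes)
    first = split_equal_counts(len(sizes), n_procs)
    return [(offsets[first[p]], offsets[first[p + 1]]) for p in range(n_procs)]


if __name__ == "__main__":
    sizes = [3, 0, 5, 2]  # bytes of payload per element, may be zero
    print(process_byte_ranges(sizes, 2))
    print(process_byte_ranges(sizes, 3))
```

Because the offsets depend only on the element order and sizes, the same file can be read back with a different process count by recomputing the split; only the ranges change, not the file layout.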
Findings
A particle tracking example scales to 21 billion particles and 64Ki MPI processes on the Juqueen supercomputer.
A parallel assembly of 768 billion elements was constructed on the Juwels supercomputer.
Partitioning and parallel file I/O accommodate variable process counts and per-element data sizes.
Abstract
We introduce several parallel algorithms operating on a distributed forest of adaptive quadtrees/octrees. They are targeted at large-scale applications relying on data layouts that are more complex than required for standard finite elements, such as hp-adaptive Galerkin methods, particle tracking and semi-Lagrangian schemes, and in-situ post-processing and visualization. Specifically, we design algorithms to derive an adapted worker forest based on sparse data, to identify owner processes in a top-down search of remote objects, and to allow for variable process counts and per-element data sizes in partitioning and parallel file I/O. We demonstrate the algorithms' usability and performance in the context of a particle tracking example that we scale to 21e9 particles and 64Ki MPI processes on the Juqueen supercomputer, and we describe the construction of a parallel assembly of variably…
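The idea of identifying owner processes can be illustrated with a much simpler stand-in than the top-down search the abstract describes. The sketch below assumes a single quadtree whose elements are ordered along a Morton (z-order) curve and a partition in which each process owns one contiguous range of that curve; the owner of a point is then found by a flat binary search over the first key of each process. The names `morton2d`, `find_owner`, and `first_keys` are hypothetical and not taken from the paper or any library.

```python
from bisect import bisect_right


def morton2d(x, y, level_bits):
    """Interleave the low level_bits bits of x and y into a z-order key."""
    key = 0
    for b in range(level_bits):
        key |= ((x >> b) & 1) << (2 * b)      # x bits go to even positions
        key |= ((y >> b) & 1) << (2 * b + 1)  # y bits go to odd positions
    return key


def find_owner(first_keys, key):
    """Rank p such that first_keys[p] <= key < first_keys[p + 1].

    first_keys is sorted, starts at 0, and has one entry per process.
    An empty process repeats the first key of its successor; bisect_right
    then skips it and returns the non-empty process that holds the key.
    """
    return bisect_right(first_keys, key) - 1


if __name__ == "__main__":
    # Four processes on a 4x4 grid (16 keys); rank 1 is empty.
    first_keys = [0, 4, 4, 10]
    key = morton2d(2, 1, level_bits=2)
    print(key, find_owner(first_keys, key))
```

A flat lookup like this answers one query per binary search. A top-down traversal, as named in the abstract, can instead process many remote objects in one pass over the tree, which is the case the paper targets.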
