DD-$\alpha$AMG on QPACE 3
Peter Georg, Daniel Richtmann, Tilo Wettig

TL;DR
This paper details the porting and performance evaluation of the DD-$\alpha$AMG solver from QPACE 2 to QPACE 3, highlighting code adaptation, communication library changes, and scalability results.
Contribution
It describes how the DD-$\alpha$AMG solver was adapted to new hardware (Knights Landing instead of Knights Corner) and a new interconnect (Omni-Path instead of InfiniBand), and reports single-processor and multi-node speedup factors close to theoretical expectations.
Findings
Speedup factors close to theoretical expectations, both on a single processor and when scaling to many nodes
Successful porting from Knights Corner to Knights Landing
Effective scaling on multiple nodes
Abstract
We describe our experience porting the Regensburg implementation of the DD-$\alpha$AMG solver from QPACE 2 to QPACE 3. We first review how the code was ported from the first-generation Intel Xeon Phi processor (Knights Corner) to its successor (Knights Landing). We then describe the modifications in the communication library necessitated by the switch from InfiniBand to Omni-Path. Finally, we present the performance of the code on a single processor as well as the scaling on many nodes, where in both cases the speedup factor is close to the theoretical expectations.
