Multiple right-hand-side setup for the DD-\alpha AMG
Daniel Richtmann, Simon Heybrock, Tilo Wettig

TL;DR
This paper introduces an improved DD-\alpha AMG setup method that processes multiple right-hand sides simultaneously, significantly reducing setup time and enhancing performance on modern many-core architectures.
Contribution
The paper presents a novel multiple right-hand-side setup implementation for DD-\alpha AMG, improving efficiency and scalability on architectures like Intel Xeon Phi.
Findings
Achieved approximately 3x speedup over single right-hand-side setup.
Enhanced network bandwidth utilization through larger message sizes.
Reduced synchronization overhead on many-core architectures.
Abstract
The setup cost of a modern solver such as DD-\alpha AMG (Wuppertal Multigrid) is a significant contribution to the total time spent on solving the Dirac equation, and in HMC it can even be dominant. We present an improved implementation of this algorithm with modified computation order in the setup procedure. By processing multiple right-hand sides simultaneously we can alleviate many of the performance issues of the default single right-hand-side setup. The main improvements are as follows: By combining multiple right-hand sides the message size for off-chip communication is larger, which leads to better utilization of the network bandwidth. Many matrix-vector products are replaced by matrix-matrix products, leading to better cache reuse. The synchronization overhead inflicted by on-chip parallelization (threading), which is becoming crucial on many-core architectures such as the Intel…
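The effect of combining right-hand sides can be illustrated with a toy model. The following is a minimal sketch, not the authors' implementation: a site-dependent 1D periodic three-point stencil stands in for the Dirac operator, and the names `apply_single`, `apply_block`, and `halo_messages` are hypothetical. It shows that a blocked (matrix-matrix style) sweep reads each site's coefficients once for all right-hand sides, and that one boundary exchange can carry all right-hand sides in a single larger message.

```python
# Toy illustration only. The real solver acts on a 4D lattice with a Dirac
# operator; here a 1D periodic 3-point stencil is assumed as a stand-in.

def apply_single(coeffs, x):
    """Return y = A x for one right-hand side.

    coeffs[i] = (lower, diag, upper) are the stencil entries of site i.
    Calling this k times re-reads every coeffs[i] k times.
    """
    n = len(x)
    y = []
    for i in range(n):
        lower, diag, upper = coeffs[i]
        y.append(lower * x[(i - 1) % n] + diag * x[i] + upper * x[(i + 1) % n])
    return y


def apply_block(coeffs, X):
    """Return Y = A X for k right-hand sides stored site-major (X[i][r]).

    The coefficients of site i are read once and reused for all k
    right-hand sides, which is the source of the better cache reuse.
    """
    n = len(X)
    k = len(X[0])
    Y = []
    for i in range(n):
        lower, diag, upper = coeffs[i]
        left, mid, right = X[(i - 1) % n], X[i], X[(i + 1) % n]
        Y.append([lower * left[r] + diag * mid[r] + upper * right[r]
                  for r in range(k)])
    return Y


def halo_messages(n_rhs, boundary_sites, blocked):
    """Return (number_of_messages, entries_per_message) for one exchange.

    Blocked: one message carrying all right-hand sides.
    Unblocked: one smaller message per right-hand side.
    """
    if blocked:
        return 1, n_rhs * boundary_sites
    return n_rhs, boundary_sites
```

Both variants move the same total amount of boundary data. The blocked form sends it in fewer, larger messages, which is the mechanism the abstract credits for better use of network bandwidth.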
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Interconnection Networks and Systems
