# pMR: A high-performance communication library

**Authors:** Peter Georg, Daniel Richtmann, Tilo Wettig

arXiv: 1701.08521 · 2017-01-31

## TL;DR

The paper introduces pMR, a lightweight high-performance communication library that can stand in for MPI in existing parallel applications. In a first lattice QCD benchmark it improves pure communication time by a factor of 2 and saves up to 20% of total execution time.

## Contribution

pMR is a de facto drop-in replacement for MPI in existing software. By avoiding some of the unnecessary overhead that MPI introduces, it improves communication performance without algorithmic or complicated implementation changes.

## Key findings

- Pure communication time improves by a factor of 2 on realistic lattices
- Total execution time drops by up to 20%
- Both results come from the coarse-grid solve of the Regensburg implementation of the DD-αAMG algorithm
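
The two headline numbers above can be related with a simple Amdahl-style estimate. The sketch below is illustrative only and not taken from the paper: it assumes communication and computation do not overlap and that only the communication part gets faster. The function names `total_time_saving` and `implied_comm_fraction` are hypothetical. Under those assumptions, a factor-2 communication speedup together with a 20% total saving would correspond to communication taking about 40% of the original run time; that 40% is an inference from the model, not a figure reported in this summary.

```python
def total_time_saving(comm_fraction: float, comm_speedup: float) -> float:
    """Fraction of total wall-clock time saved when only communication speeds up.

    Assumes (hypothetically) no overlap between communication and computation.
    comm_fraction: share of the original run time spent communicating (0..1).
    comm_speedup:  factor by which communication time shrinks (> 0).
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must lie in [0, 1]")
    if comm_speedup <= 0.0:
        raise ValueError("comm_speedup must be positive")
    return comm_fraction * (1.0 - 1.0 / comm_speedup)


def implied_comm_fraction(total_saving: float, comm_speedup: float) -> float:
    """Invert the model: communication share implied by an observed total saving."""
    if comm_speedup <= 1.0:
        raise ValueError("comm_speedup must exceed 1 to produce a saving")
    return total_saving / (1.0 - 1.0 / comm_speedup)


if __name__ == "__main__":
    # Headline figures from the summary: 2x communication speedup, up to 20% total saving.
    fraction = implied_comm_fraction(total_saving=0.20, comm_speedup=2.0)
    print(f"implied communication share: {fraction:.0%}")
    print(f"saving check: {total_time_saving(fraction, 2.0):.0%}")
```

The same model also shows why the benefit is largest in the strong-scaling limit: as the communication share of the run time grows, the same communication speedup yields a larger total saving.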

## Abstract

On many parallel machines, the time LQCD applications spend in communication is a significant contribution to the total wall-clock time, especially in the strong-scaling limit. We present a novel high-performance communication library that can be used as a de facto drop-in replacement for MPI in existing software. Its lightweight nature, which avoids some of the unnecessary overhead introduced by MPI, allows us to improve the communication performance of applications without any algorithmic or complicated implementation changes. As a first real-world benchmark, we make use of the pMR library in the coarse-grid solve of the Regensburg implementation of the DD-$\alpha$AMG algorithm. On realistic lattices, we see an improvement of a factor of 2 in pure communication time and total execution time savings of up to 20%.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1701.08521/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1701.08521/full.md

## References

5 references — full list in the complete paper: https://tomesphere.com/paper/1701.08521/full.md

---
Source: https://tomesphere.com/paper/1701.08521