# Predict-and-recompute conjugate gradient variants

**Authors:** Tyler Chen, Erin C. Carson

arXiv: 1905.01549 · 2021-04-20

## TL;DR

This paper introduces conjugate gradient variants that reduce communication bottlenecks on parallel architectures by overlapping global synchronizations. A predict-and-recompute scheme keeps their convergence nearly as good as that of the standard conjugate gradient implementation.

## Contribution

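The paper presents conjugate gradient variants that decrease the runtime per iteration by overlapping global synchronizations and, in the pipelined variants, matrix-vector products. A predict-and-recompute scheme keeps their convergence close to that of the standard implementation, and a rounding error analysis gives insight into this observed behavior.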

## Key findings

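- Experiments verify that the variants reduce the runtime per iteration in practice.
- The variants scale similarly to previously studied communication-hiding variants.
- On a variety of test problems, convergence is observed to be nearly as good as with the standard CG implementation.
- The variants need no additional input parameters.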

## Abstract

The standard implementation of the conjugate gradient algorithm suffers from communication bottlenecks on parallel architectures, due primarily to the two global reductions required every iteration. In this paper, we study conjugate gradient variants which decrease the runtime per iteration by overlapping global synchronizations, and in the case of pipelined variants, matrix-vector products. Through the use of a predict-and-recompute scheme, whereby recursively-updated quantities are first used as a predictor for their true values and then recomputed exactly at a later point in the iteration, these variants are observed to have convergence behavior nearly as good as the standard conjugate gradient implementation on a variety of test problems. We provide a rounding error analysis which provides insight into this observation. It is also verified experimentally that the variants studied do indeed reduce the runtime per iteration in practice and that they scale similarly to previously-studied communication-hiding variants. Finally, because these variants achieve good convergence without the use of any additional input parameters, they have the potential to be used in place of the standard conjugate gradient implementation in a range of applications.
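The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm listing: it is unpreconditioned, uses dense list-of-lists matrices, and all function and variable names (`standard_cg`, `predict_recompute_cg`, `nu_pred`, and so on) are assumptions made for illustration. The first function marks the two dependent inner products of standard CG. The second predicts the squared residual norm from the identity ‖r − αs‖² = ‖r‖² − 2α⟨r, s⟩ + α²⟨s, s⟩, uses that prediction to form the next search direction, and then recomputes the norm exactly alongside the other inner products, so that all of them could share one global reduction. Overlapping that reduction with a matrix-vector product, as the pipelined variants do, is not shown.

```python
def dot(u, v):
    """Inner product; on a distributed machine this is a global reduction."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [dot(row, v) for row in A]

def axpy(a, x, y):
    """Return a*x + y elementwise."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def standard_cg(A, b, tol=1e-10, max_iter=1000):
    """Textbook CG: two inner products per iteration, the second depending on the first."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    nu = dot(r, r)
    stop = tol * tol * dot(b, b)
    for _ in range(max_iter):
        if nu <= stop:
            break
        s = matvec(A, p)
        mu = dot(p, s)            # reduction 1
        alpha = nu / mu
        x = axpy(alpha, p, x)
        r = axpy(-alpha, s, r)
        nu_new = dot(r, r)        # reduction 2: needs r, which needs alpha
        p = axpy(nu_new / nu, p, r)
        nu = nu_new
    return x

def predict_recompute_cg(A, b, tol=1e-10, max_iter=1000):
    """Sketch of predict-and-recompute: one grouped batch of inner products per iteration."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    nu = dot(r, r)
    stop = tol * tol * dot(b, b)
    if nu <= stop:
        return x
    s = matvec(A, p)
    mu, delta, gamma = dot(p, s), dot(r, s), dot(s, s)
    for _ in range(max_iter):
        alpha = nu / mu
        x = axpy(alpha, p, x)
        r = axpy(-alpha, s, r)
        # Predict ||r||^2 from scalars already available (no new reduction).
        nu_pred = nu - 2.0 * alpha * delta + alpha * alpha * gamma
        beta = nu_pred / nu
        p = axpy(beta, p, r)
        s = matvec(A, p)
        # Recompute nu exactly, together with the other inner products;
        # these four are independent and could share a single reduction.
        mu, delta, gamma, nu = dot(p, s), dot(r, s), dot(s, s), dot(r, r)
        if nu <= stop:
            break
    return x
```

Both functions take a symmetric positive definite matrix `A` and a right-hand side `b`, for example `predict_recompute_cg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])`. In exact arithmetic the predicted and recomputed values agree, so the two functions generate the same iterates. The recomputation step is what stops the recursively updated scalar from drifting in finite precision.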

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01549/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01549/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1905.01549/full.md

---
Source: https://tomesphere.com/paper/1905.01549