Understanding performance variability in standard and pipelined parallel Krylov solvers

Hannah Morgan; Patrick Sanan; Matthew G. Knepley; Richard Tran Mills

arXiv:2103.12067 · cs.MS · March 24, 2021 · Int. J. High Perform. Comput. Appl.

TL;DR

This paper investigates the performance variability of Krylov solvers caused by machine noise, demonstrating that pipelined algorithms reduce variability and proposing an improved non-stationary performance model for better prediction.

Contribution

The paper extends a previously introduced nondeterministic performance model by allowing iterations to fluctuate over time, and supports it with data from runs of various Krylov algorithms across multiple platforms.

Findings

01. Large variability in Krylov iterations across nodes for standard methods

02. Pipelined algorithms significantly reduce performance variability

03. The updated model accurately predicts observed performance fluctuations

Abstract

In this work, we collect data from runs of Krylov subspace methods and pipelined Krylov algorithms in an effort to understand and model the impact of machine noise and other sources of variability on performance. We find large variability of Krylov iterations between compute nodes for standard methods that is reduced in pipelined algorithms, directly supporting conjecture, as well as large variation between statistical distributions of runtimes across iterations. Based on these results, we improve upon a previously introduced nondeterministic performance model by allowing iterations to fluctuate over time. We present our data from runs of various Krylov algorithms across multiple platforms as well as our updated non-stationary model that provides good agreement with observations. We also suggest how it can be used as a predictive tool.
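
The modeling idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or their exact model. It assumes a common idealization: a standard Krylov method synchronizes all processes every iteration, so each iteration costs as much as its slowest process, while an idealized pipelined method only waits for the slowest process over the whole solve. The function names, the exponential noise, and the drifting mean are all hypothetical choices made for illustration. Letting the noise distribution depend on the iteration index `k` is what makes the sketch non-stationary.

```python
import random


def simulate(num_procs, num_iters, draw, seed=0):
    """Return (standard_time, pipelined_time) for one simulated solve.

    draw(rng, k) returns one process's time for iteration k. Because it
    may depend on k, the per-iteration distribution can change over time.
    """
    rng = random.Random(seed)
    # times[k][p]: time that process p spends on iteration k
    times = [[draw(rng, k) for _ in range(num_procs)] for k in range(num_iters)]

    # Assumed standard method: a global synchronization every iteration,
    # so each iteration costs as much as its slowest process.
    standard = sum(max(row) for row in times)

    # Assumed idealized pipelined method: no per-iteration synchronization,
    # so the solve costs as much as the slowest process overall.
    pipelined = max(
        sum(times[k][p] for k in range(num_iters)) for p in range(num_procs)
    )
    return standard, pipelined


def noisy_iteration(rng, k):
    # Hypothetical noise: exponential, with a mean that drifts with k.
    mean = 1.0 + 0.05 * (k % 10)
    return rng.expovariate(1.0 / mean)


standard, pipelined = simulate(num_procs=64, num_iters=200, draw=noisy_iteration)
print(f"standard:  {standard:.1f}")
print(f"pipelined: {pipelined:.1f}")
```

Under these assumptions the pipelined total can never exceed the standard total, since a maximum of sums is bounded by the sum of maxima. Running `simulate` with different seeds gives a rough feel for how much each total varies from run to run.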
