A new tool for the performance analysis of massively parallel computer systems
Anton Stefanek (Imperial College London), Richard Hayden (Imperial College London), Jeremy Bradley (Imperial College London)

TL;DR
This paper introduces GPA, a scalable differential equation-based tool for performance analysis of large parallel systems, capable of higher moment analysis and demonstrating improved accuracy with scale.
Contribution
The paper presents GPA, a novel performance analysis tool that uses ODEs to efficiently analyze large systems and produce higher moments, including variance.
Findings
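GPA generates key performance measures for very large systems by solving ODEs, which scales far better than stochastic simulation.
Across several large models, the accuracy of the ODE performance prediction generally improves as model scale increases.
For the variance measure, the authors justify theoretically that the ODE approximation tends to the model's actual variance in the limit of model scale.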
Abstract
We present a new tool, GPA, that can generate key performance measures for very large systems. Based on solving systems of ordinary differential equations (ODEs), this method of performance analysis is far more scalable than stochastic simulation. The GPA tool is the first to produce higher moment analysis from differential equation approximation, which is essential, in many cases, to obtain an accurate performance prediction. We identify so-called switch points as the source of error in the ODE approximation. We investigate the switch point behaviour in several large models and observe that as the scale of the model is increased, in general the ODE performance prediction improves in accuracy. In the case of the variance measure, we are able to justify theoretically that in the limit of model scale, the ODE approximation can be expected to tend to the actual variance of the model.
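To make the idea of an ODE approximation with a switch point concrete, the following is a minimal sketch on a toy client/server model. It is not GPA's input language or its algorithm. The model, the function name `fluid_waiting`, and all rate parameters are assumptions chosen for illustration. Clients move from thinking to waiting at rate `think_rate` per client, and waiting clients are served at the bounded rate `min(request_rate * w, serve_rate * n_servers)`. The point where the two arguments of `min` are equal plays the role of a switch point. Only the first moment (the mean number of waiting clients) is integrated here.

```python
def fluid_waiting(n_clients, n_servers, think_rate, request_rate,
                  serve_rate, t_end, dt=0.01):
    """Integrate a toy fluid (mean-field) ODE for the number of waiting clients.

    dw/dt = think_rate * (n_clients - w)
            - min(request_rate * w, serve_rate * n_servers)

    Hypothetical model for illustration only; uses classical RK4 steps.
    Returns the list of w values at t = 0, dt, 2*dt, ..., t_end.
    """
    def deriv(w):
        arrivals = think_rate * (n_clients - w)
        # The min() term changes branch at the switch point
        # request_rate * w == serve_rate * n_servers.
        service = min(request_rate * w, serve_rate * n_servers)
        return arrivals - service

    w = 0.0  # all clients start in the thinking state
    trajectory = [w]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = deriv(w)
        k2 = deriv(w + 0.5 * dt * k1)
        k3 = deriv(w + 0.5 * dt * k2)
        k4 = deriv(w + dt * k3)
        w += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        trajectory.append(w)
    return trajectory


if __name__ == "__main__":
    traj = fluid_waiting(n_clients=100, n_servers=10, think_rate=1.0,
                         request_rate=2.0, serve_rate=5.0, t_end=20.0)
    print(f"waiting clients at t=20: {traj[-1]:.4f}")
```

With these assumed parameters the servers saturate, so the steady state solves `think_rate * (n_clients - w) = serve_rate * n_servers`, that is `w = 50`. The trajectory crosses the switch point at `w = 25` on the way there. A stochastic simulation of the same toy model would fluctuate around this curve, and the abstract's claim is that such deviations generally shrink as the model is scaled up.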
