A parallel workload has extreme variability in a production environment
R. Henwood, N. W. Watkins, S. C. Chapman, R. McLay

TL;DR
This paper models the extreme variability in parallel data writing workloads in production environments using the GEV distribution, incorporating traffic congestion effects, and analyzes workload behavior under high parallelism.
Contribution
It extends the GEV model for parallel workload variability by including traffic congestion and provides empirical analysis in HPC environments.
Findings
The distribution of workload durations tends towards a GEV distribution as parallelism increases.
Traffic congestion significantly impacts workload duration variability.
The model offers insights for optimizing machine design under variable workloads.
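The first finding can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: task durations are independent and exponentially distributed with unit mean, there is no congestion term, and the workload duration is the duration of its slowest task. The helper names (`gev_cdf`, `workload_duration`, `simulate`, `empirical_cdf`) are hypothetical. Under these assumptions the maximum of n task durations, shifted by ln n, approaches the Gumbel distribution, which is the GEV distribution with shape parameter zero.

```python
import math
import random


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of the GEV distribution; xi == 0 is the Gumbel limit."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def workload_duration(n_tasks, rng):
    """One parallel write finishes when its slowest task finishes.

    Assumption: i.i.d. exponential task durations with unit mean.
    """
    return max(rng.expovariate(1.0) for _ in range(n_tasks))


def simulate(n_tasks, n_runs, seed=0):
    """Durations of n_runs independent workloads of n_tasks tasks each."""
    rng = random.Random(seed)
    return [workload_duration(n_tasks, rng) for _ in range(n_runs)]


def empirical_cdf(samples, x):
    """Fraction of samples less than or equal to x."""
    return sum(s <= x for s in samples) / len(samples)


if __name__ == "__main__":
    n_tasks, n_runs = 256, 5000
    durations = simulate(n_tasks, n_runs)
    mu = math.log(n_tasks)  # location of the limiting Gumbel for unit-mean exponentials
    for z in (-1.0, 0.0, 1.0, 2.0):
        x = mu + z
        print(f"x={x:6.3f}  empirical={empirical_cdf(durations, x):.3f}  "
              f"gumbel={gev_cdf(x, mu=mu):.3f}")
```

Running the script prints the empirical and limiting CDF values side by side. Other task-duration distributions would lead to a different shape parameter, and production measurements would need the three GEV parameters fitted to data rather than derived analytically as here.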
Abstract
Writing data in parallel is a common operation in some computing environments and a good proxy for a number of other parallel processing patterns. The duration of time taken to write data in large-scale compute environments can vary considerably. This variation comes from a number of sources, both systematic and transient. The result is a highly complex behavior that is difficult to characterize. This paper further develops the model for parallel task variability proposed in the paper "A parallel workload has extreme variability" (Henwood et al. 2016). This model is the Generalized Extreme Value (GEV) distribution. This paper further develops the systematic analysis that leads to the GEV model with the addition of a traffic congestion term. Observations of a parallel workload are presented from a High Performance Computing environment under typical production conditions, which include…
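For reference, the standard textbook form of the GEV cumulative distribution function is given below. The symbols (location \mu, scale \sigma > 0, shape \xi) follow the usual convention and are an assumption here, since the summary does not state the paper's notation. The traffic congestion term is not reproduced because its form is not given in the text above.

```latex
G(x;\mu,\sigma,\xi) =
\exp\!\left\{ -\left[ 1 + \xi \, \frac{x-\mu}{\sigma} \right]^{-1/\xi} \right\},
\qquad 1 + \xi \, \frac{x-\mu}{\sigma} > 0,
\]
\[
\lim_{\xi \to 0} G(x;\mu,\sigma,\xi) =
\exp\!\left\{ -\exp\!\left( -\frac{x-\mu}{\sigma} \right) \right\}.
```

The second expression is the Gumbel case, which arises when the underlying task durations have exponentially decaying tails.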
Taxonomy
Topics: Hydrology and Drought Analysis
