Work Stealing with latency

Mohammed Khatiri; Denis Trystram; Frederic Wagner

arXiv:1805.01768 · cs.DC · May 7, 2018

Open Access

TL;DR

This paper analyzes how communication latency affects the efficiency of the Work Stealing load balancing algorithm, providing a new predictive model for expected running time and optimal processor usage.

Contribution

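It introduces a latency parameter into existing performance models and derives a new expression for the expected running time of divisible load applications, which helps determine the maximal number of processors worth using for a given combination of work and platform.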

Findings

01 Communication latency significantly affects the load-balancing performance of Work Stealing.

02 The new running-time expression predicts the conditions under which a run yields acceptable performance.

03 Simulations over a wide range of parameters validate the model.

Abstract

In this paper, we study the impact of communication latency on the classical Work Stealing load balancing algorithm. Our approach considers existing performance models and the underlying algorithms. We introduce a latency parameter in the model and study its overall impact through careful observation of simulation results. Using this method, we are able to derive a new expression for the expected running time of divisible load applications. This expression enables us to predict under which conditions a given run will yield acceptable performance. For instance, we can easily calibrate the maximal number of processors one should use for a given combination of work and platform. We also consider the impact of several algorithmic variants, such as simultaneous transfers of work or thresholds for avoiding useless transfers. All our results are validated through simulation on a wide range of parameters.
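To make the setting in the abstract concrete, here is a minimal sketch of a Work Stealing simulation with a latency parameter. It is not the authors' simulator. The function name `simulate_work_stealing`, the discrete time steps, the unit-sized work items, the steal-half policy, the uniformly random victim choice, and the rule of one outstanding steal request per idle processor are all assumptions made for illustration.

```python
import random


def simulate_work_stealing(total_work, processors, latency, seed=0):
    """Return the makespan (in time steps) of one simulated run.

    Assumed model (illustrative only): all work starts on processor 0,
    a busy processor executes one unit per step, and an idle processor
    sends a steal request to a random victim. The request takes
    `latency` steps to reach the victim, which gives away half of its
    remaining work; the reply takes another `latency` steps to return.
    """
    rng = random.Random(seed)
    work = [0] * processors      # units of work held locally
    work[0] = total_work
    request = {}                 # thief -> (arrival time at victim, victim)
    reply = {}                   # thief -> (arrival time at thief, amount)
    remaining = total_work       # units not yet executed
    t = 0
    while remaining > 0:
        # 1. Steal requests reaching their victim: split the victim's work.
        for thief, (when, victim) in list(request.items()):
            if when <= t:
                amount = work[victim] // 2
                work[victim] -= amount
                reply[thief] = (t + latency, amount)
                del request[thief]
        # 2. Replies reaching the thief: stolen work becomes local work.
        for thief, (when, amount) in list(reply.items()):
            if when <= t:
                work[thief] += amount
                del reply[thief]
        # 3. Busy processors execute one unit; idle ones start a steal.
        for p in range(processors):
            if work[p] > 0:
                work[p] -= 1
                remaining -= 1
            elif processors > 1 and p not in request and p not in reply:
                victim = rng.choice([q for q in range(processors) if q != p])
                request[p] = (t + latency, victim)
        t += 1
    return t


if __name__ == "__main__":
    # Compare makespans as latency grows, for a fixed amount of work.
    for lam in (1, 10, 100):
        print(lam, simulate_work_stealing(100_000, 16, lam))
```

Running the sketch for several processor counts at a fixed latency gives a rough feel for the question the abstract raises: beyond some number of processors, the time spent waiting on steal requests may outweigh the benefit of extra workers. The numbers it produces depend entirely on the assumptions above and should not be read as the paper's results.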

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Interconnection Networks and Systems