Scalable Load Balancing in Networked Systems: Universality Properties and Stochastic Coupling Methods
Mark van der Boor, Sem C. Borst, Johan S.H. van Leeuwaarden, and Debankur Mukherjee

TL;DR
This paper analyzes scalable load balancing algorithms in large systems, showing that near-optimal performance can be achieved with minimal communication overhead, provided the number of sampled servers d(N) grows suitably with the number of servers N.
Contribution
It demonstrates that JSQ(d(N)) policies remain asymptotically optimal as long as d(N) grows sufficiently fast with N, which reduces communication costs.
Findings
JSQ(d(N)) policies achieve near-optimal performance with appropriate growth of d(N)
Performance is insensitive to the exact growth rate of d(N), as long as it is sufficiently fast
Join-the-Idle-Queue scheme further reduces communication overhead using dispatcher memory
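To make the role of dispatcher memory concrete, here is a minimal sketch of a Join-the-Idle-Queue style dispatcher as the scheme is commonly described: servers notify the dispatcher when they become idle, the dispatcher remembers those notifications, and a task goes to a remembered idle server when one exists. The class name `JIQDispatcher`, the method names, the assumption that all servers start idle, and the fallback to a uniformly random server are illustrative choices, not details taken from the paper.

```python
import random
from collections import deque


class JIQDispatcher:
    """Hypothetical sketch of a Join-the-Idle-Queue dispatcher."""

    def __init__(self, num_servers, rng=random):
        self.num_servers = num_servers
        self.rng = rng
        # Dispatcher memory: one token per server known to be idle.
        # Assumption: every server starts idle.
        self.idle_tokens = deque(range(num_servers))

    def report_idle(self, server):
        """Called by a server when it becomes idle."""
        self.idle_tokens.append(server)

    def dispatch(self):
        """Return the index of the server that receives the next task."""
        if self.idle_tokens:
            # Use a remembered idle server; no queue lengths are probed.
            return self.idle_tokens.popleft()
        # Assumption: with no known idle server, pick one uniformly at random.
        return self.rng.randrange(self.num_servers)
```

In this sketch the only messages exchanged are the idle notifications, at most one per task, rather than several queue-length probes per arrival.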
Abstract
We present an overview of scalable load balancing algorithms which provide favorable delay performance in large-scale systems, and yet only require minimal implementation overhead. Aimed at a broad audience, the paper starts with an introduction to the basic load balancing scenario, consisting of a single dispatcher where tasks arrive that must immediately be forwarded to one of N single-server queues. A popular class of load balancing algorithms are so-called power-of-d or JSQ(d) policies, where an incoming task is assigned to a server with the shortest queue among d servers selected uniformly at random. This class includes the Join-the-Shortest-Queue (JSQ) policy as a special case (d = N), which has strong stochastic optimality properties and yields a mean waiting time that vanishes as N grows large for any fixed subcritical load. However, a nominal implementation of the…
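The power-of-d assignment rule described in the abstract can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function name `jsq_d`, sampling without replacement, and random tie-breaking are assumptions. Here `queue_lengths` holds the current queue length of each of the N servers and `d` is the number of servers sampled per task.

```python
import random


def jsq_d(queue_lengths, d, rng=random):
    """Return the index of the server chosen for an incoming task under JSQ(d)."""
    n = len(queue_lengths)
    if not 1 <= d <= n:
        raise ValueError("d must satisfy 1 <= d <= N")
    # Assumption: the d servers are distinct (sampled without replacement).
    sampled = rng.sample(range(n), d)
    shortest = min(queue_lengths[i] for i in sampled)
    # Assumption: ties among the sampled servers are broken uniformly at random.
    candidates = [i for i in sampled if queue_lengths[i] == shortest]
    return rng.choice(candidates)
```

With `d` equal to the number of servers, every queue is inspected and the rule reduces to JSQ; with `d = 1` it reduces to purely random assignment. The communication cost per task scales with `d`, which is why the growth rate d(N) matters.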
