Optimal Hyper-Scalable Load Balancing with a Strict Queue Limit
Mark van der Boor, Sem Borst, Johan van Leeuwaarden

TL;DR
This paper studies load balancing in large-scale systems that have an arbitrarily sparse communication budget and a strict queue limit. It derives a universal upper bound on throughput and proposes a scheme that is proven throughput-optimal in this hyper-scalable regime.
Contribution
The paper introduces a universal upper bound on the throughput of dispatcher algorithms under sparse communication and queue constraints. It also presents a scheme that achieves this bound and is therefore throughput-optimal.
Findings
A universal throughput bound is established for sparse communication regimes.
The proposed scheme operates at any message rate and enforces the queue limit.
The scheme is proven to be throughput-optimal in many-server regimes.
Abstract
Load balancing plays a critical role in efficiently dispatching jobs in parallel-server systems such as cloud networks and data centers. A fundamental challenge in the design of load balancing algorithms is to achieve an optimal trade-off between delay performance and implementation overhead (e.g. communication or memory usage). This trade-off has primarily been studied so far from the angle of the amount of overhead required to achieve asymptotically optimal performance, particularly vanishing delay in large-scale systems. In contrast, in the present paper, we focus on an arbitrarily sparse communication budget, possibly well below the minimum requirement for vanishing delay, referred to as the hyper-scalable operating region. Furthermore, jobs may only be admitted when a specific limit on the queue position of the job can be guaranteed. The centerpiece of our analysis is a universal…
