Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers
Alexander Stolyar

TL;DR
This paper analyzes a multi-router pull-based load balancing algorithm in large heterogeneous server pools, proving its asymptotic optimality and low communication overhead under certain conditions.
Contribution
It extends the PULL algorithm to multiple routers and proves its asymptotic optimality in large-scale systems with minimal message exchange.
Findings
Asymptotic optimality of PULL in multi-router systems
Vanishing customer blocking/waiting probability as system scales
Low message exchange rate of one message per customer
Abstract
The model is a service system consisting of several large server pools. A server's processing speed and buffer size (which may be finite or infinite) depend on the pool. The input flow of customers is split equally among a fixed number of routers, which must assign customers to the servers immediately upon arrival. We consider an asymptotic regime in which the total customer arrival rate and the pool sizes scale to infinity simultaneously, in proportion to a scaling parameter, while the number of routers remains fixed. We define and study a multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced in [11] for the single-router model. Under the PULL algorithm, when a server becomes idle it sends a "pull-message" to a router selected uniformly at random; each router operates independently -- it assigns an arriving customer to a server according to a…
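To make the message flow concrete, here is a minimal simulation sketch in Python. It is not the paper's construction, and several pieces are assumptions. The abstract is cut off before the router's rule is fully stated, so the sketch assumes a router uses a uniformly chosen stored pull-message when it holds one and otherwise picks a server uniformly at random from the whole system. It also assumes Poisson arrivals, exponential service times, infinite buffers, and that a pull-message is discarded when its idle server is hit by a random assignment. The function name `simulate_pull`, the pool sizes, the rates, and the load are all hypothetical.

```python
import heapq
import random


def simulate_pull(pools=((100, 1.0), (100, 2.0)), num_routers=4,
                  load=0.5, num_customers=20000, seed=0):
    """Toy multi-router pull-based routing; all parameters are illustrative."""
    rng = random.Random(seed)
    # One service rate per server; servers in the same pool share a rate.
    rates = [mu for size, mu in pools for _ in range(size)]
    num_servers = len(rates)
    arrival_rate = load * sum(rates)

    queue_len = [0] * num_servers
    # pull_msgs[r] lists the servers whose pull-message router r holds.
    pull_msgs = [[] for _ in range(num_routers)]
    holder = [None] * num_servers  # router holding each idle server's message

    def send_pull(server):
        # An idle server notifies one router chosen uniformly at random.
        r = rng.randrange(num_routers)
        pull_msgs[r].append(server)
        holder[server] = r

    for s in range(num_servers):  # all servers start idle (not counted below)
        send_pull(s)

    departures = []  # min-heap of (completion time, server)
    waited = messages = arrived = 0
    next_arrival = rng.expovariate(arrival_rate)
    while arrived < num_customers:
        if departures and departures[0][0] <= next_arrival:
            now, s = heapq.heappop(departures)
            queue_len[s] -= 1
            if queue_len[s] > 0:  # start serving the next queued customer
                heapq.heappush(departures, (now + rng.expovariate(rates[s]), s))
            else:                 # server just became idle: one pull-message
                send_pull(s)
                messages += 1
        else:
            now = next_arrival
            next_arrival = now + rng.expovariate(arrival_rate)
            arrived += 1
            r = rng.randrange(num_routers)  # equal split among routers
            if pull_msgs[r]:
                # Assumed rule: use a uniformly chosen stored pull-message.
                s = pull_msgs[r].pop(rng.randrange(len(pull_msgs[r])))
            else:
                # Assumed fallback: uniform choice over all servers.
                s = rng.randrange(num_servers)
                if holder[s] is not None:  # drop the now-stale message
                    pull_msgs[holder[s]].remove(s)
            holder[s] = None
            if queue_len[s] > 0:
                waited += 1  # customer found the server busy
            queue_len[s] += 1
            if queue_len[s] == 1:
                heapq.heappush(departures, (now + rng.expovariate(rates[s]), s))
    return {"wait_fraction": waited / num_customers,
            "messages_per_customer": messages / num_customers}
```

Calling `simulate_pull()` returns the fraction of customers that found their server busy and the number of pull-messages sent per customer. In this sketch a pull-message is sent only when a departure leaves a server idle, so the second figure cannot exceed one.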
