Mean-field analysis of load balancing principles in large scale systems
Illés Horváth, Márton Mészáros

TL;DR
This paper develops a mathematical framework to analyze and compare different load balancing strategies in large-scale heterogeneous server systems using mean-field theory, focusing on their transient and stationary performance.
Contribution
It introduces a high-level mean-field analysis framework for heterogeneous server clusters, enabling the study of various load balancing principles and their performance metrics.
Findings
Derived transient and stationary mean-field limits for load balancing policies
Compared performance measures such as job system time distributions
Provided insights into the effectiveness of different load balancing strategies
Abstract
Load balancing plays a crucial role in many large-scale systems. Several different load balancing principles have been proposed in the literature, such as Join-Shortest-Queue (JSQ) and its variations, or Join-Below-Threshold. We provide a high-level mathematical framework to examine heterogeneous server clusters in the mean-field limit as the system load and the number of servers scale proportionally. We aim to identify both the transient mean-field limit and the stationary mean-field limit for various choices of load balancing principles, compute relevant performance measures such as the distribution and mean of the system time of jobs, and conduct a comparison from a performance point of view.
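For readers new to the idea, the following is a minimal sketch of what a transient and a stationary mean-field limit look like in the simplest textbook setting. It is not the paper's heterogeneous framework: it assumes identical servers with exponential service at rate 1, Poisson arrivals at rate `lam` per server, and the JSQ(d) variant in which each job samples `d` servers at random and joins the shortest queue among them. The function names, the truncation level `max_len`, and the Euler step `dt` are illustrative choices, not taken from the paper.

```python
def jsq_d_mean_field(lam, d, max_len=30, dt=0.01, t_end=200.0):
    """Euler-integrate the classic homogeneous JSQ(d) mean-field ODE.

    s[i] is the fraction of servers holding at least i jobs, so s[0] = 1.
    Assumed model (not the paper's): identical servers, service rate 1,
    arrival rate lam per server, d sampled servers per job.
        ds_i/dt = lam * (s_{i-1}^d - s_i^d) - (s_i - s_{i+1})
    """
    # Start from an empty system; s[max_len + 1] is pinned at 0 (truncation).
    s = [1.0] + [0.0] * (max_len + 1)
    for _ in range(int(t_end / dt)):
        new = s[:]
        for i in range(1, max_len + 1):
            arrivals = lam * (s[i - 1] ** d - s[i] ** d)
            departures = s[i] - s[i + 1]
            new[i] = s[i] + dt * (arrivals - departures)
        s = new
    return s


def jsq_d_fixed_point(lam, d, i):
    """Known stationary tail of the same model for d >= 2."""
    return lam ** ((d ** i - 1) / (d - 1))


if __name__ == "__main__":
    lam, d = 0.7, 2
    s = jsq_d_mean_field(lam, d)
    for i in range(1, 5):
        print(i, s[i], jsq_d_fixed_point(lam, d, i))
```

Running the integration for a shorter `t_end` gives a point on the transient trajectory, while a long horizon approaches the stationary limit, which in this simplified model can be checked against the closed-form tail. By Little's law, the mean system time in the same model is the sum of `s[i]` over `i >= 1` divided by `lam`.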
Taxonomy
Topics: Advanced Queuing Theory Analysis · Cloud Computing and Resource Management · Distributed and Parallel Computing Systems
