Asynchronous Load Balancing and Auto-scaling: Mean-Field Limit and Optimal Design
Jonatha Anselmi

TL;DR
This paper develops a Markovian framework for asynchronous load balancing and auto-scaling inspired by serverless platforms, optimizing delay and energy efficiency in large-scale systems.
Contribution
It introduces a novel asynchronous auto-scaling design with a mean-field analysis, providing conditions for optimal delay and energy performance in large networks.
Findings
Proposes a general condition on scaling rules under which delay and energy performance are optimal in the mean-field limit.
Designs a family of scaling rules that satisfy this condition.
Numerical results show improved delay with comparable energy use.
Abstract
We develop a Markovian framework for load balancing that combines classical algorithms such as Power-of-d with auto-scaling mechanisms that allow the net service capacity to scale up or down in response to the current load on the same timescale as job dynamics. Our framework is inspired by serverless platforms, such as Knative, where servers are software functions that can be flexibly instantiated in milliseconds according to scaling rules defined by the users of the serverless platform. The main question is how to design such scaling rules to minimize user-perceived delay while ensuring low energy consumption. For the first time, we investigate this problem when the auto-scaling and load balancing processes operate asynchronously (or proactively), as in Knative. In contrast to the synchronous (or reactive) paradigm, asynchronism brings the advantage that jobs do not…
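For readers unfamiliar with Power-of-d, the dispatching step the abstract refers to can be sketched as follows. This is a minimal illustration of the classical algorithm only, not the paper's model or its scaling rules: the function names `power_of_d` and `dispatch` are hypothetical, queue lengths are kept in a plain list, and the d candidates are assumed to be sampled without replacement.

```python
import random


def power_of_d(queue_lengths, d, rng):
    """Pick a server for an incoming job using the Power-of-d rule.

    Samples d distinct servers uniformly at random (an assumption; some
    analyses sample with replacement) and returns the index of the one
    with the shortest queue among them.
    """
    candidates = rng.sample(range(len(queue_lengths)), d)
    return min(candidates, key=lambda i: queue_lengths[i])


def dispatch(queue_lengths, d, rng):
    """Route one job: choose a server with Power-of-d and enqueue the job there."""
    target = power_of_d(queue_lengths, d, rng)
    queue_lengths[target] += 1
    return target


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed for reproducibility
    queues = [0] * 10        # ten initially empty servers
    for _ in range(50):
        dispatch(queues, d=2, rng=rng)
    print(queues)            # queue lengths after 50 arrivals and no departures
```

With d = 1 this reduces to uniformly random routing, and with d equal to the number of servers it reduces to join-the-shortest-queue. An auto-scaling rule, which the paper studies but this sketch omits, would additionally change which servers are active over time.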
