Online Learning of Weakly Coupled MDP Policies for Load Balancing and Auto Scaling
S.R. Eshwar, Lucas Lopes Felipe, Alexandre Reiffers-Masson, Daniel Sadoc Menasché, Gugan Thoppe

TL;DR
This paper develops a new model and algorithms for load balancing and auto scaling in systems with bursty traffic, using weakly coupled MDPs and a two-timescale LP-based approach for online learning and policy optimization.
Contribution
It introduces a novel weakly coupled MDP model for load balancing and auto scaling, along with a tractable relaxed LP formulation and an online learning algorithm.
Findings
Effective LP relaxation for complex control problems
Successful online policy learning in dynamic environments
Improved scalability for load balancing and auto scaling
Abstract
Load balancing and auto scaling are at the core of scalable, contemporary systems, addressing dynamic resource allocation and service rate adjustments in response to workload changes. This paper introduces a novel model and algorithms for tuning load balancers coupled with auto scalers, considering bursty traffic arriving at finite queues. We begin by presenting the problem as a weakly coupled Markov Decision Process (MDP), solvable via a linear program (LP). However, as the number of control variables of this LP grows combinatorially, we introduce a more tractable relaxed LP formulation, and extend it to tackle the problem of online parameter learning and policy optimization using a two-timescale algorithm based on the LP Lagrangian.
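For readers unfamiliar with the relaxation mentioned in the abstract, the block below sketches the generic textbook form of a relaxed LP for a weakly coupled average-cost MDP, written in terms of per-subsystem occupancy measures. It is not taken from the paper: the symbols (N subsystems, occupancy measures x_n, costs c_n, transition kernels P_n, resource usage d_n, budget b, multiplier \lambda) are illustrative assumptions, and the paper's actual formulation may differ.

```latex
% Generic relaxed LP for a weakly coupled MDP (illustrative, assumed notation).
% x_n(s,a): long-run fraction of time subsystem n spends in state s taking action a.
\begin{align*}
\min_{x \ge 0} \quad & \sum_{n=1}^{N} \sum_{s,a} c_n(s,a)\, x_n(s,a) \\
\text{s.t.} \quad
  & \sum_{a} x_n(s',a) = \sum_{s,a} P_n(s' \mid s,a)\, x_n(s,a)
    && \forall n,\ \forall s' \\
  & \sum_{s,a} x_n(s,a) = 1
    && \forall n \\
  & \sum_{n=1}^{N} \sum_{s,a} d_n(s,a)\, x_n(s,a) \le b
    && \text{(coupling constraint, enforced on average)}
\end{align*}

% Lagrangian obtained by dualizing only the coupling constraint, with \lambda \ge 0:
\[
  L(x,\lambda) = \sum_{n=1}^{N} \sum_{s,a} \bigl( c_n(s,a) + \lambda\, d_n(s,a) \bigr)\, x_n(s,a) - \lambda b
\]
```

In this generic form the number of variables grows with the sum of the per-subsystem state-action spaces rather than their product, which is what makes the relaxation tractable. For a fixed \lambda the Lagrangian separates across subsystems, which is the usual reason primal-dual schemes are built on it.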
Taxonomy
Topics: Advanced Queuing Theory Analysis · Age of Information Optimization · Distributed and Parallel Computing Systems
