Non-Stationary Gradient Descent for Optimal Auto-Scaling in Serverless Platforms
Jonatha Anselmi, Bruno Gaujal, Louis-Sebastien Rebuffi

TL;DR
This paper introduces a stochastic gradient descent method for optimal auto-scaling in serverless platforms, addressing non-stationary conditions and providing convergence guarantees, with validation through simulations.
Contribution
It develops a novel non-stationary stochastic optimization approach for auto-scaling, with proven asymptotic optimality and convergence rate, tailored for transient system behaviors.
Findings
The proposed method achieves asymptotic optimality almost surely.
The scheme converges at rate $O(n^{-2/3})$.
Numerical simulations show improved performance over existing rules.
Abstract
To efficiently manage serverless computing platforms, a key aspect is the auto-scaling of services, i.e., the set of computational resources allocated to a service adapts over time as a function of the traffic demand. The objective is to find a compromise between user-perceived performance and energy consumption. In this paper, we consider the \emph{scale-per-request} auto-scaling pattern and investigate how many function instances (or servers) should be spawned each time an \emph{unfortunate} job arrives, i.e., a job that finds all servers busy upon its arrival. We address this problem by following a stochastic optimization approach: we develop a stochastic gradient descent scheme of the Kiefer--Wolfowitz type that applies \emph{over a single run of the state evolution}. At each iteration, the proposed scheme computes an estimate of the number of servers to spawn each time an…
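To make the Kiefer--Wolfowitz idea mentioned in the abstract concrete, here is a minimal sketch of a classic finite-difference stochastic approximation loop on a toy problem. It is not the authors' scheme: the paper's method runs over a single trajectory of the system state and handles non-stationarity, whereas this sketch assumes independent noisy cost evaluations. The cost function `noisy_cost`, its minimizer at 3, the step sizes `a / n`, the perturbation widths `c / n**(1/3)`, and the projection interval `[lo, hi]` are all illustrative assumptions.

```python
import random


def noisy_cost(x, rng):
    """Hypothetical noisy cost: quadratic with minimum at x = 3, plus Gaussian noise.

    Stands in for a measured trade-off between performance and energy;
    it is an assumption for illustration, not the paper's cost model.
    """
    return (x - 3.0) ** 2 + rng.gauss(0.0, 0.5)


def kiefer_wolfowitz(cost, x0, n_iters, rng, a=1.0, c=1.0, lo=0.0, hi=10.0):
    """Classic Kiefer-Wolfowitz iteration with a two-sided finite difference.

    a_n = a / n is the step size and c_n = c / n**(1/3) the perturbation
    width (textbook choices, assumed here). Iterates are projected onto
    [lo, hi] to keep them in a bounded decision range.
    """
    x = x0
    for n in range(1, n_iters + 1):
        a_n = a / n
        c_n = c / n ** (1.0 / 3.0)
        # Gradient estimate from two noisy cost evaluations around x.
        grad_est = (cost(x + c_n, rng) - cost(x - c_n, rng)) / (2.0 * c_n)
        # Descent step followed by projection onto the feasible interval.
        x = min(hi, max(lo, x - a_n * grad_est))
    return x


rng = random.Random(0)  # fixed seed so the run is reproducible
x_hat = kiefer_wolfowitz(noisy_cost, x0=8.0, n_iters=20000, rng=rng)
print(f"estimated minimizer: {x_hat:.3f}")
```

In the auto-scaling setting, `x` would play the role of the (relaxed, real-valued) number of servers to spawn when a job finds all servers busy, and the two cost evaluations would come from observed system behavior rather than from a closed-form function.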
Taxonomy
Topics: Mobile Agent-Based Network Management · Cloud Computing and Resource Management
