Collective Autoscaling for Cloud Microservices
Vighnesh Sachidananda, Anirudh Sivaraman

TL;DR
COLA is a collective autoscaling system for microservices that optimizes VM allocation to minimize costs while maintaining end-to-end latency targets, outperforming existing autoscalers in various cloud environments.
Contribution
We introduce COLA, a novel autoscaler that allocates resources across all of an application's microservices jointly, based on end-to-end latency, rather than scaling each microservice independently.
Findings
COLA reduces cost by 19.3% on average relative to the next cheapest autoscaler.
It achieves latency targets in 53 out of 63 workloads.
COLA is optimal or near-optimal in 90% of exhaustively tested cases.
Abstract
As cloud applications shift from monoliths to loosely coupled microservices, application developers must decide how many compute resources (e.g., number of replicated containers) to assign to each microservice within an application. This decision affects both (1) the dollar cost to the application developer and (2) the end-to-end latency perceived by the application user. Today, individual microservices are autoscaled independently by adding VMs whenever per-microservice CPU or memory utilization crosses a configurable threshold. However, an application user's end-to-end latency consists of time spent on multiple microservices and each microservice might need a different number of VMs to achieve an overall end-to-end latency. We present COLA, an autoscaler for microservice-based applications, which collectively allocates VMs to microservices with a global goal of minimizing dollar…
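The contrast the abstract draws between independent threshold-based scaling and collective allocation can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy latency model (per-service latency equals work divided by VM count, with services visited in sequence), the function names, the service names, and the brute-force search. It is not COLA's algorithm, only a minimal instance of "pick the cheapest allocation that meets an end-to-end latency target."

```python
from itertools import product


def threshold_autoscale(replicas, cpu_util, threshold=0.7):
    """Independent rule: add one VM to each service whose CPU utilization
    exceeds the threshold, without looking at end-to-end latency."""
    return {
        svc: n + 1 if cpu_util[svc] > threshold else n
        for svc, n in replicas.items()
    }


def end_to_end_latency(alloc, work_ms):
    """Toy model (assumption): a request visits every service in sequence,
    and each service's latency is its work divided by its VM count."""
    return sum(work_ms[svc] / alloc[svc] for svc in alloc)


def collective_allocate(work_ms, target_ms, max_vms=6, cost_per_vm=1.0):
    """Brute-force search over all allocations; return (cost, allocation)
    for the cheapest one meeting the latency target, or None if none does."""
    services = sorted(work_ms)
    best = None
    for counts in product(range(1, max_vms + 1), repeat=len(services)):
        alloc = dict(zip(services, counts))
        if end_to_end_latency(alloc, work_ms) > target_ms:
            continue  # violates the end-to-end latency target
        cost = cost_per_vm * sum(counts)
        if best is None or cost < best[0]:
            best = (cost, alloc)
    return best


if __name__ == "__main__":
    # Hypothetical per-request work (ms on a single VM) for three services.
    work = {"frontend": 40, "cart": 120, "db": 60}
    print(threshold_autoscale({"frontend": 1, "cart": 1, "db": 1},
                              {"frontend": 0.5, "cart": 0.9, "db": 0.6}))
    print(collective_allocate(work, target_ms=100))
```

Exhaustive search is only tractable for a handful of services and small VM counts, since the number of candidate allocations grows exponentially with the number of services; the sketch is meant to show the objective, not a practical search strategy.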
Taxonomy
Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · IoT and Edge/Fog Computing
