Leveraging Interpretability in the Transformer to Automate the Proactive Scaling of Cloud Resources
Amadou Ba, Pavithra Harsha, Chitra Subramanian

TL;DR
This paper presents an interpretable Transformer-based model that predicts microservice latency and automates resource scaling to meet SLAs, reducing operational costs and improving QoS in cloud-native environments.
Contribution
It introduces a novel approach combining Temporal Fusion Transformer interpretability with Kernel Ridge Regression for proactive resource scaling in microservices.
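This summary does not spell out how the two components are connected, so the following is only a minimal sketch of the Kernel Ridge Regression half under stated assumptions. It assumes the regressor maps a pair (front-end request rate, CPU allocation) to end-to-end latency, and that scaling means picking the smallest candidate allocation whose predicted latency meets the SLA. The training data is synthetic, and the helper names (`fit_krr`, `predict_krr`, `smallest_allocation_meeting_sla`), the RBF kernel, and the hyperparameters are hypothetical choices for illustration, not the authors' implementation. The Temporal Fusion Transformer part is not shown.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_krr(features, targets, gamma=1.0, ridge=1e-3):
    """Return dual coefficients alpha solving (K + ridge * I) alpha = y."""
    n = len(features)
    gram = [
        [
            rbf_kernel(features[i], features[j], gamma) + (ridge if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return solve_linear_system(gram, targets)


def predict_krr(features, alpha, query, gamma=1.0):
    """Predict the target at `query` as a kernel-weighted sum over training points."""
    return sum(a * rbf_kernel(x, query, gamma) for a, x in zip(alpha, features))


def smallest_allocation_meeting_sla(features, alpha, request_rate, candidates,
                                    sla_ms, gamma=1.0):
    """Return the smallest candidate CPU allocation whose predicted latency
    is within the SLA, or None if no candidate qualifies."""
    for cpu in sorted(candidates):
        if predict_krr(features, alpha, (request_rate, cpu), gamma) <= sla_ms:
            return cpu
    return None


# Synthetic training grid (assumption): request rate in hundreds of req/s,
# CPU allocation in cores, latency in ms from a made-up formula.
rates = [1.0, 2.0, 3.0, 4.0, 5.0]
cpus = [0.5, 1.0, 1.5, 2.0]
train_x = [(r, c) for r in rates for c in cpus]
train_y = [20.0 + 40.0 * r / c for r, c in train_x]

alpha = fit_krr(train_x, train_y)
choice = smallest_allocation_meeting_sla(train_x, alpha, 3.0, cpus, sla_ms=120.0)
print("suggested CPU allocation (cores):", choice)
```

In a real deployment the training rows would come from monitoring data rather than a formula, and the kernel width and ridge penalty would need tuning. The pure-Python solver is used here only to keep the sketch dependency-free.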
Findings
Effective latency prediction for microservices
Automated resource adjustment ensuring SLA compliance
Demonstrated approach on a real microservice application
Abstract
Modern web services adopt cloud-native principles to leverage the advantages of microservices. To consistently guarantee high Quality of Service (QoS) according to Service Level Agreements (SLAs), ensure satisfactory user experiences, and minimize operational costs, each microservice must be provisioned with the right amount of resources. However, accurately provisioning microservices with adequate resources is complex and depends on many factors, including workload intensity and the complex interconnections between microservices. To address this challenge, we develop a model that captures the relationship between end-to-end latency, requests at the front-end level, and resource utilization. We then use the developed model to predict the end-to-end latency. Our solution leverages the Temporal Fusion Transformer (TFT), an attention-based architecture equipped with interpretability…
Taxonomy
Topics: Software System Performance and Reliability
Methods: Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Linear Layer · Adam · Dropout · Layer Normalization · Dense Connections
