Leveraging Interpretability in the Transformer to Automate the Proactive Scaling of Cloud Resources

Amadou Ba; Pavithra Harsha; Chitra Subramanian

arXiv:2409.03103·cs.LG·September 6, 2024

Open Access

TL;DR

This paper presents an interpretable Transformer-based model that predicts microservice latency and automates resource scaling to meet SLAs, reducing operational costs and improving QoS in cloud-native environments.

Contribution

It introduces a novel approach combining Temporal Fusion Transformer interpretability with Kernel Ridge Regression for proactive resource scaling in microservices.

Findings

01. Effective latency prediction for microservices

02. Automated resource adjustment ensuring SLA compliance

03. Demonstrated approach on a real microservice application

Abstract

Modern web services adopt cloud-native principles to leverage the advantages of microservices. To consistently guarantee high Quality of Service (QoS) according to Service Level Agreements (SLAs), ensure satisfactory user experiences, and minimize operational costs, each microservice must be provisioned with the right amount of resources. However, accurately provisioning microservices with adequate resources is complex and depends on many factors, including workload intensity and the complex interconnections between microservices. To address this challenge, we develop a model that captures the relationship between an end-to-end latency, requests at the front-end level, and resource utilization. We then use the developed model to predict the end-to-end latency. Our solution leverages the Temporal Fusion Transformer (TFT), an attention-based architecture equipped with interpretability…
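
The abstract is truncated above and does not spell out how the model's output drives scaling, so the following is only a minimal sketch under stated assumptions, not the authors' method. It fits a small kernel ridge regression (written by hand, standing in for any latency model; the TFT itself is not reproduced) from a normalized request rate and a CPU allocation to end-to-end latency, then picks the smallest candidate allocation whose predicted latency meets an SLA. The function names, the synthetic data, the 120 ms threshold, and the kernel settings are all hypothetical.

```python
import math

def rbf(a, b, gamma):
    """RBF kernel value between two feature vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_krr(X, y, gamma=1.0, lam=1e-3):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    K = [[rbf(a, b, gamma) + (lam if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    return solve(K, y)

def predict_krr(X, alpha, x_new, gamma=1.0):
    """Predict latency for x_new from the fitted dual coefficients."""
    return sum(a * rbf(xi, x_new, gamma) for a, xi in zip(alpha, X))

def min_cpu_for_sla(X, alpha, load, candidates, sla_ms, gamma=1.0):
    """Smallest candidate CPU allocation whose predicted latency meets the SLA, else None."""
    for cpu in sorted(candidates):
        if predict_krr(X, alpha, (load, cpu), gamma) <= sla_ms:
            return cpu
    return None

# Synthetic training data (illustrative only): latency grows with load / cpu.
X = [(load, cpu) for load in (0.5, 1.0, 1.5, 2.0) for cpu in (1.0, 2.0, 3.0, 4.0)]
y = [50.0 + 100.0 * load / cpu for load, cpu in X]
alpha = fit_krr(X, y, gamma=0.5, lam=1e-3)

# Hypothetical SLA of 120 ms at a normalized load of 1.5.
print(min_cpu_for_sla(X, alpha, 1.5, [1.0, 2.0, 3.0, 4.0], 120.0, gamma=0.5))
```

Scanning candidates in ascending order keeps the allocation as small as the predicted latency allows, which reflects the cost-versus-SLA trade-off the abstract describes. A production setup would likely replace the hand-written solver with a library implementation and the synthetic data with measured traces.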

Taxonomy

Topics: Software System Performance and Reliability

Methods: Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Linear Layer · Adam · Dropout · Layer Normalization · Dense Connections