FIRM: An Intelligent Fine-Grained Resource Management Framework for SLO-Oriented Microservices
Haoran Qiu, Subho S. Banerjee, Saurabh Jha, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer

TL;DR
FIRM is an intelligent resource management framework that uses machine learning and telemetry data to dynamically allocate resources in microservices, significantly reducing SLO violations and tail latencies.
Contribution
FIRM introduces a novel adaptive framework that detects, localizes, and mitigates resource contention in microservices using online data and machine learning techniques.
Findings
Reduces SLO violations by up to 16x
Decreases overall CPU requests by up to 62%
Reduces tail latencies by up to 11x
Abstract
Modern user-facing latency-sensitive web services include numerous distributed, intercommunicating microservices that promise to simplify software development and operation. However, multiplexing of compute resources across microservices is still challenging in production because contention for shared resources can cause latency spikes that violate the service-level objectives (SLOs) of user requests. This paper presents FIRM, an intelligent fine-grained resource management framework for predictable sharing of resources across microservices to drive up overall utilization. FIRM leverages online telemetry data and machine-learning methods to adaptively (a) detect/localize microservices that cause SLO violations, (b) identify low-level resources in contention, and (c) take actions to mitigate SLO violations via dynamic reprovisioning. Experiments across four microservice benchmarks…
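The three steps named in the abstract, (a) detect/localize, (b) identify the contended resource, and (c) reprovision, can be illustrated with a minimal sketch. This is not FIRM's implementation: the abstract says FIRM uses machine-learning methods, whereas the sketch below substitutes simple threshold heuristics purely for illustration. All names (`ServiceTelemetry`, `localize`, `contended_resource`, `mitigate`, `control_step`), the p99-based SLO check, and the 1.25x scaling step are hypothetical assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class ServiceTelemetry:
    """Hypothetical per-microservice telemetry snapshot (not FIRM's schema)."""
    name: str
    latencies_ms: list[float]      # recent per-request latencies
    utilization: dict[str, float]  # resource -> fraction of its limit in use
    limits: dict[str, float]       # resource -> current limit (arbitrary units)


def p99(samples):
    # quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
    return quantiles(samples, n=100)[-1]


def violates_slo(end_to_end_ms, slo_ms):
    # (a) Detect: assumed SLO definition is "p99 end-to-end latency <= slo_ms".
    return p99(end_to_end_ms) > slo_ms


def localize(services):
    # (a) Localize: assumed heuristic, pick the service with the worst p99.
    return max(services, key=lambda s: p99(s.latencies_ms))


def contended_resource(service):
    # (b) Identify: assumed heuristic, the resource closest to its limit.
    return max(service.utilization, key=service.utilization.get)


def mitigate(service, resource, step=1.25):
    # (c) Mitigate: raise the limit of the contended resource by a fixed factor.
    service.limits[resource] *= step
    return service.limits[resource]


def control_step(end_to_end_ms, services, slo_ms):
    """Run one detect -> localize -> identify -> mitigate pass.

    Returns (service name, resource, new limit), or None if the SLO holds.
    """
    if not violates_slo(end_to_end_ms, slo_ms):
        return None
    culprit = localize(services)
    resource = contended_resource(culprit)
    return culprit.name, resource, mitigate(culprit, resource)
```

A control loop would call `control_step` periodically with fresh telemetry. A learned policy, as the abstract describes, would replace the fixed heuristics in `localize`, `contended_resource`, and `mitigate`.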
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · IoT and Edge/Fog Computing
