ORACL: Optimized Reasoning for Autoscaling via Chain of Thought with LLMs for Microservices
Haoyu Bai (1), Muhammed Tawfiqul Islam (1), Minxian Xu (2), Rajkumar Buyya (1) ((1) Quantum Cloud Computing and Distributed Systems (qCLOUDS) Lab, School of Computing and Information Systems, The University of Melbourne, Australia, (2) Shenzhen Institute of Advanced Technology)

TL;DR
ORACL uses large language models with chain-of-thought reasoning to diagnose performance issues and optimize resource allocation in microservice architectures, achieving better accuracy and efficiency without retraining.
Contribution
The paper introduces ORACL, a novel LLM-based framework that leverages semantic reasoning for autoscaling in microservices, eliminating the need for extensive retraining.
Findings
Root-cause identification accuracy improved by 15%
Training time reduced by up to 24x
Quality of service increased by 6% in short-term scenarios
Abstract
Applications are moving away from monolithic designs to microservice and serverless architectures, where fleets of lightweight and independently deployable components run on public clouds. Autoscaling serves as the primary control mechanism for balancing resource utilization and quality of service, yet existing policies are either opaque learned models that require substantial per-deployment training or brittle hand-tuned rules that fail to generalize. We investigate whether large language models can act as universal few-shot resource allocators that adapt across rapidly evolving microservice deployments. We propose ORACL, Optimized Reasoning for Autoscaling via Chain of Thought with LLMs for Microservices, a framework that leverages prior knowledge and chain-of-thought reasoning to diagnose performance regressions and recommend resource allocations. ORACL transforms runtime…
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Software-Defined Networks and 5G
