Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4
Xuchao Zhang, Supriyo Ghosh, Chetan Bansal, Rujia Wang, Minghua Ma, Yu Kang, Saravan Rajmohan

TL;DR
This paper presents an in-context learning approach using GPT-4 for automated root cause analysis in cloud incidents, outperforming fine-tuned models without requiring costly fine-tuning.
Contribution
The study introduces a novel in-context learning method for RCA that eliminates the need for fine-tuning GPT-4, reducing costs and maintaining high accuracy.
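To make the idea of in-context learning for RCA concrete, the sketch below builds a few-shot prompt from previously resolved incidents instead of fine-tuning a model. It is a minimal illustration, not the authors' pipeline: the `Incident` schema, the token-overlap (Jaccard) retrieval, the prompt wording, and all function names are assumptions introduced here, and the call to the language model itself is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Incident:
    """A resolved historical incident (hypothetical schema for illustration)."""
    summary: str
    root_cause: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(query: str, history: list[Incident], k: int = 2) -> list[Incident]:
    """Pick the k historical incidents most similar to the new one.

    Assumption: simple token overlap stands in for whatever retrieval
    method a real system would use.
    """
    q = tokenize(query)
    ranked = sorted(
        history,
        key=lambda inc: jaccard(q, tokenize(inc.summary)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, examples: list[Incident]) -> str:
    """Assemble a few-shot prompt: solved examples first, new incident last."""
    parts = ["Identify the root cause of the final incident, following the examples.", ""]
    for inc in examples:
        parts += [f"Incident: {inc.summary}", f"Root cause: {inc.root_cause}", ""]
    parts += [f"Incident: {query}", "Root cause:"]
    return "\n".join(parts)
```

The resulting string would be sent to the model as-is; because the examples are chosen at query time, newly resolved incidents can be added to `history` without retraining anything.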
Findings
Outperforms fine-tuned GPT-3 by an average of 24.8% across all metrics
Achieves a 49.7% improvement over zero-shot GPT-4
Human evaluation by incident owners shows a 43.5% improvement in correctness over the fine-tuned model
Abstract
Root Cause Analysis (RCA) plays a pivotal role in the incident diagnosis process for cloud services, requiring on-call engineers to identify the primary issues and implement corrective actions to prevent future recurrences. Improving the incident RCA process is vital for minimizing service downtime, customer impact and manual toil. Recent advances in artificial intelligence have introduced state-of-the-art Large Language Models (LLMs) like GPT-4, which have proven effective in tackling various AIOps problems, ranging from code authoring to incident management. Nonetheless, the GPT-4 model's immense size presents challenges when trying to fine-tune it on user data because of the significant GPU resource demand and the necessity for continuous model fine-tuning with the emergence of new data. To address the high cost of fine-tuning LLMs, we propose an in-context learning approach for…
Taxonomy
Topics: Traffic Prediction and Management Techniques · Data Quality and Management · Software System Performance and Reliability
Methods: Attention Is All You Need · Discriminative Fine-Tuning · Cosine Annealing · Byte Pair Encoding · Adam · Label Smoothing
