Leveraging LLMs for Structured Information Extraction and Analysis from Cloud Incident Reports (Work In Progress Paper)
Xiaoyu Chu, Shashikant Ilager, Yizhen Zang, Sacheendra Talluri, Alexandru Iosup

TL;DR
This paper investigates how large language models can be used to extract structured information from unstructured cloud incident reports, evaluating multiple models, prompts, and datasets to optimize accuracy and efficiency.
Contribution
It introduces a comprehensive methodology for using LLMs to extract key incident report data, comparing prompt strategies and models, and providing practical insights for improving incident report analysis.
Findings
LLMs achieve 75-95% accuracy in metadata extraction.
Few-shot prompting improves accuracy and reduces latency for most fields.
Lightweight models offer better trade-offs between cost, latency, and accuracy.
Abstract
Incident management is essential to maintain the reliability and availability of cloud computing services. Cloud vendors typically disclose incident reports to the public, summarizing the failures and recovery process to help minimize their impact. However, such reports are often lengthy and unstructured, making them difficult to understand, analyze, and use for long-term dependability improvements. The emergence of LLMs offers new opportunities to address this challenge, but how to achieve this is currently understudied. In this paper, we explore the use of cutting-edge LLMs to extract key information from unstructured cloud incident reports. First, we collect more than 3,000 incident reports from 3 leading cloud service providers (AWS, Azure, and GCP), and manually annotate these collected samples. Then, we design and compare 6 prompt strategies to extract and classify different types…
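To make the extraction setup in the abstract concrete, here is a minimal Python sketch of how a zero-shot prompt and a few-shot prompt might be built for one report, and how model replies could be scored against manual annotations. It is an illustration under stated assumptions, not the authors' implementation: the field names, the example report, and the helper functions `build_prompt`, `parse_answer`, and `field_accuracy` are all hypothetical, and the model call itself is left out.

```python
import json

# Hypothetical field names; the paper's actual annotation schema is not
# given in this summary.
FIELDS = ["provider", "service", "severity"]

# Hypothetical worked example used for few-shot prompting.
EXAMPLES = [
    (
        "Between 09:00 and 10:30 UTC, customers of Example Storage saw "
        "elevated error rates.",
        {"provider": "ExampleCloud", "service": "Example Storage",
         "severity": "major"},
    ),
]


def build_prompt(report, examples=()):
    """Build a zero-shot prompt (no examples) or a few-shot prompt."""
    parts = [
        "Extract the following fields from the incident report and answer "
        "with a single JSON object: " + ", ".join(FIELDS) + "."
    ]
    # Each example contributes one solved report/answer pair.
    for text, labels in examples:
        parts.append("Report: " + text)
        parts.append("Answer: " + json.dumps(labels, sort_keys=True))
    # The target report comes last, with the answer left open.
    parts.append("Report: " + report)
    parts.append("Answer:")
    return "\n".join(parts)


def parse_answer(raw):
    """Parse a model reply into the expected fields; missing ones are None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {field: data.get(field) for field in FIELDS}


def field_accuracy(predictions, annotations):
    """Fraction of exact matches per field against manual annotations."""
    total = len(annotations)
    return {
        field: sum(p[field] == a[field]
                   for p, a in zip(predictions, annotations)) / total
        for field in FIELDS
    }
```

Calling `build_prompt(report)` yields the zero-shot variant, while `build_prompt(report, EXAMPLES)` yields the few-shot variant. Running both over the same annotated reports and comparing the `field_accuracy` results is one simple way to compare prompt strategies field by field.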
Taxonomy
Topics: Research Data Management Practices · Scientific Computing and Data Management · Software System Performance and Reliability
