RCA Copilot: Transforming Network Data into Actionable Insights via Large Language Models

Alexander Shan; Jasleen Kaur; Rahul Singh; Tarun Banka; Raj Yavatkar; T. Sridhar

arXiv:2507.03224·cs.NI·July 8, 2025

TL;DR

RCACopilot combines large language model reasoning with statistical tests to automate root cause analysis across complex network environments, giving engineers clear explanations and actionable steps while reducing manual effort.

Contribution

This paper introduces RCACopilot, a novel system that integrates LLM reasoning with statistical tests to automate and explain network root cause analysis.

Findings

01 RCACopilot achieves high accuracy in identifying network root causes.

02 The system provides clear explanations and actionable steps for engineers.

03 It demonstrates effectiveness across diverse network environments.

Abstract

Ensuring the reliability and availability of complex networked services demands effective root cause analysis (RCA) across cloud environments, data centers, and on-premises networks. Traditional RCA methods, which involve manual inspection of data sources such as logs and telemetry data, are often time-consuming and challenging for on-call engineers. While statistical inference methods have been employed to estimate the causality of network events, these approaches alone are similarly challenging and suffer from a lack of interpretability, making it difficult for engineers to understand the predictions made by black-box models. In this paper, we present RCACopilot, an advanced on-call system that combines statistical tests and large language model (LLM) reasoning to automate RCA across various network environments. RCACopilot gathers and synthesizes critical runtime diagnostic…
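The abstract's pairing of statistical tests with LLM reasoning can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which tests or prompts RCACopilot uses, so a two-sided permutation test stands in for the statistical step. The function names, metric names, alert text, and prompt wording are all hypothetical. No LLM is called; the sketch stops at building the prompt text.

```python
import random
from statistics import mean


def permutation_p_value(baseline, incident, rounds=2000, seed=0):
    """Two-sided permutation test on the difference of means.

    Stand-in for the paper's unspecified statistical tests (assumption).
    """
    rng = random.Random(seed)
    observed = abs(mean(incident) - mean(baseline))
    pooled = list(baseline) + list(incident)
    n = len(baseline)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        # Re-split the shuffled pool and compare against the observed gap.
        diff = abs(mean(pooled[n:]) - mean(pooled[:n]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (hits + 1) / (rounds + 1)


def rank_metrics(windows, alpha=0.05):
    """Keep metrics whose shift is significant, most significant first."""
    scored = [(name, permutation_p_value(b, i)) for name, (b, i) in windows.items()]
    return sorted((s for s in scored if s[1] < alpha), key=lambda s: s[1])


def build_prompt(alert, ranked):
    """Assemble a plain-text prompt; the wording is illustrative only."""
    lines = [f"Alert: {alert}",
             "Metrics that shifted significantly during the incident:"]
    lines += [f"- {name} (p = {p:.4f})" for name, p in ranked]
    lines.append("Suggest the most likely root cause and a next diagnostic step.")
    return "\n".join(lines)


# Hypothetical telemetry: (baseline window, incident window) per metric.
windows = {
    "link_util_pct": ([41, 39, 40, 42, 38, 40, 41, 39],
                      [88, 91, 87, 93, 90, 89, 92, 94]),
    "cpu_pct": ([30, 35, 28, 33, 31, 36, 29, 32],
                [31, 34, 29, 30, 35, 33, 28, 32]),
}
ranked = rank_metrics(windows)
prompt = build_prompt("packet loss on edge-router-1", ranked)
print(prompt)
```

In this sketch the statistical step filters the telemetry first, so the language model sees only the metrics that actually moved. That also makes the evidence behind any suggested cause visible to the engineer reading the output.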
