TN-AutoRCA: Benchmark Construction and Agentic Framework for Self-Improving Alarm-Based Root Cause Analysis in Telecommunication Networks

Keyu Wu; Qianjin Yu; Manlin Mei; Ruiting Liu; Jun Wang; Kailai Zhang; Yelun Bao

arXiv:2507.18190·cs.CL·July 29, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces TN-AutoRCA, a benchmark and framework for self-improving alarm-based root cause analysis in telecom networks, addressing AI challenges in complex graph reasoning and benchmark scarcity.

Contribution

It presents a novel benchmark and agentic framework for self-improving RCA, enhancing AI capabilities in complex telecommunication network analysis.

Findings

01. Developed a new benchmark for RCA in telecom networks

02. Proposed an agentic framework for self-improvement in RCA

03. Demonstrated improved accuracy in root cause identification

Abstract

Root Cause Analysis (RCA) in telecommunication networks is a critical task, yet it presents a formidable challenge for Artificial Intelligence (AI) due to its complex, graph-based reasoning requirements and the scarcity of realistic benchmarks.

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 5

Strengths

1. New Benchmark: TN-RCA is a new benchmark for RCA in telecom networks, addressing a gap in the field. It is built on real-world data and features a principled difficulty stratification method via cycle consistency checks.
2. Agentic Framework Design: Auto-RCA introduces a modular, feedback-driven agent architecture that systematically improves code-based solutions.

Weaknesses

1. Limited Motivational Depth: While RCA is an important industrial problem, the paper does not sufficiently justify why it is a meaningful or challenging AI research task. The connection to broader machine learning challenges, beyond telecom-specific applications, is underdeveloped, making the contribution feel more applied than foundational.
2. Superficial Experimental Analysis: The experiments, though comprehensive in model comparisons, lack in-depth analysis. For instance, there is little di

Reviewer 02 · Rating 2 · Confidence 4

Strengths

1. The benchmark and experiments are grounded in real-world telecommunication scenarios and data. 2. The paper demonstrates that a code-based solution can be effective for RCA tasks, achieving strong results on OpenRCA.

Weaknesses

1. Novelty and relevance: The Auto-RCA framework appears applicable to general code-based problem-solving tasks; it is unclear which design elements are specifically tailored to the telecommunication RCA context.
2. Missing statistics: The paper lacks details about the training set’s size and difficulty distribution, which are crucial given that Auto-RCA’s core mechanism relies on contrastive feedback.
3. Missing case analysis: The paper is missing failure cases of LLM-only methods and pre-/post

Reviewer 03 · Rating 2 · Confidence 3

Strengths

S1: The proposed benchmark is the first RCA benchmark in telecommunication networks.

Weaknesses

W1: The paper lacks focus. It proposes a new benchmark, TN-RCA, which should be challenging, and then a new method, Auto-RCA, which achieves a 91% F1-score. This suggests that the new benchmark is not challenging enough.
W2: For the proposed benchmark TN-RCA, the authors evaluated only LLM-based methods and ignored traditional RCA methods. RCA for graph-structured data has been extensively studied in the microservice domain [1][2][3]. It is not clear how they perform on these benchmark

Taxonomy

Topics: Software System Performance and Reliability · Network Security and Intrusion Detection · Smart Grid Security and Resilience