# From Data to Decision: A Multi-Stage Framework for Class Imbalance Mitigation in Optical Network Failure Analysis

**Authors:** Yousuf Moiz Ali, Jaroslaw E. Prilepsky, Nicola Sambo, Joao Pedro, Mohammad M. Hosseini, Antonio Napoli, Sergei K. Turitsyn, Pedro Freire

arXiv: 2509.00057 · 2025-09-03

## TL;DR

This paper compares pre-, in-, and post-processing techniques for mitigating class imbalance in optical network failure detection and identification, and shows that the best-performing method depends on the task, the degree of class overlap, and whether inference latency is constrained.

## Contribution

The paper provides a direct comparison of pre-, in-, and post-processing imbalance mitigation strategies on an experimental dataset. It covers post-processing methods, which have been largely unexplored in this setting, and evaluates Generative AI approaches alongside them.

## Key findings

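- For failure detection, post-processing methods, particularly Threshold Adjustment, achieve the highest F1 score improvement (up to 15.3%), while Random Under-Sampling provides the fastest inference.
- For failure identification, GenAI methods deliver the largest performance gains (up to 24.2%), whereas post-processing has limited impact in multi-class settings.
- When class overlap is present and latency is critical, over-sampling methods such as SMOTE are most effective; without latency constraints, Meta-Learning yields the best results.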
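
To make the over-sampling idea concrete, here is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the paper's implementation: the function name `smote_like`, the toy `failures` points, and the choice of `k=2` neighbours are all assumptions made for illustration. The sketch creates each synthetic minority sample on the line segment between a real minority sample and one of its nearest minority neighbours.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper for illustration, not a library API.)"""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Toy 2-D feature vectors for the rare "failure" class (made-up values).
failures = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
new_points = smote_like(failures, n_new=6)
print(len(new_points), "synthetic failure samples")
```

Because all of the extra work happens on the training set, a model trained this way keeps its original inference cost, which fits with over-sampling being attractive when latency is critical. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be used rather than a hand-written version.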

## Abstract

Machine learning-based failure management in optical networks has gained significant attention in recent years. However, severe class imbalance, where normal instances vastly outnumber failure cases, remains a considerable challenge. While pre- and in-processing techniques have been widely studied, post-processing methods are largely unexplored. In this work, we present a direct comparison of pre-, in-, and post-processing approaches for class imbalance mitigation in failure detection and identification using an experimental dataset. For failure detection, post-processing methods, particularly Threshold Adjustment, achieve the highest F1 score improvement (up to 15.3%), while Random Under-Sampling provides the fastest inference. In failure identification, GenAI methods deliver the most substantial performance gains (up to 24.2%), whereas post-processing shows limited impact in multi-class settings. When class overlap is present and latency is critical, over-sampling methods such as SMOTE are most effective; without latency constraints, Meta-Learning yields the best results. In low-overlap scenarios, Generative AI approaches provide the highest performance with minimal inference time.
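
Threshold Adjustment, the post-processing method named in the abstract, can be illustrated with a short sketch. This is an assumption about the general technique rather than the authors' procedure: the helper names `f1_score` and `best_threshold`, the toy labels, and the toy classifier scores are all hypothetical. The idea is to leave the trained binary detector untouched and move the decision threshold away from the default 0.5 to the value that maximises F1 on held-out data.

```python
def f1_score(labels, preds):
    """F1 for the positive (failure) class; 0.0 when it is undefined."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(labels, scores, default=0.5):
    """Sweep every observed score as a candidate threshold and return
    the (threshold, f1) pair with the highest F1."""
    best_t = default
    best_f1 = f1_score(labels, [int(s >= default) for s in scores])
    for t in sorted(set(scores)):
        f1 = f1_score(labels, [int(s >= t) for s in scores])
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Toy validation set: 8 normal samples (0) and 2 failures (1), with
# made-up predicted failure probabilities from some trained detector.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.25, 0.30, 0.45, 0.40, 0.70]

default_f1 = f1_score(labels, [int(s >= 0.5) for s in scores])
threshold, tuned_f1 = best_threshold(labels, scores)
print(f"default F1={default_f1:.3f}, tuned F1={tuned_f1:.3f} at t={threshold}")
```

On this toy data the default threshold misses one of the two failures, while lowering the threshold recovers it at the cost of one false alarm. The threshold should be chosen on a validation split and then applied unchanged to test data; tuning it on the test set would overstate the gain.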

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00057/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00057/full.md

## References

55 references — full list in the complete paper: https://tomesphere.com/paper/2509.00057/full.md

---
Source: https://tomesphere.com/paper/2509.00057