Deep Learning for Network Anomaly Detection under Data Contamination: Evaluating Robustness and Mitigating Performance Degradation

D'Jeff K. Nkashama; Jordan Masakuna Félicien; Arian Soltani; Jean-Charles Verdier; Pierre-Martin Tardif; Marc Frappier; Froduald Kabanza

arXiv:2407.08838·cs.LG·September 16, 2024

Open Access

TL;DR

This paper evaluates the robustness of deep learning models for network anomaly detection against data contamination and proposes an auto-encoder based mitigation method to improve their resilience.

Contribution

The paper introduces a protocol for evaluating the robustness of deep-learning-based network anomaly detection (NAD) models and proposes a constrained auto-encoder to mitigate the effects of contamination.
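The idea of evaluating robustness under contamination can be illustrated with a small sketch. This is not the paper's protocol, whose details are not given on this page; it only assumes that a contaminated training set is built by swapping a chosen fraction of benign rows for attack rows while the training labels stay hidden. The function name `contaminate`, the fixed training-set size, and the synthetic Gaussian data are all hypothetical.

```python
import numpy as np

def contaminate(benign, attacks, ratio, seed=0):
    """Build a training set in which a fraction `ratio` of rows are attacks.

    Assumption (not from the paper): the training-set size is held fixed at
    len(benign), and attack rows replace benign rows rather than being appended.
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    rng = np.random.default_rng(seed)
    n_total = len(benign)
    n_attack = int(round(ratio * n_total))
    if n_attack > len(attacks):
        raise ValueError("not enough attack samples for this ratio")

    # Sample without replacement so no row is duplicated.
    benign_idx = rng.choice(len(benign), size=n_total - n_attack, replace=False)
    attack_idx = rng.choice(len(attacks), size=n_attack, replace=False)

    X = np.concatenate([benign[benign_idx], attacks[attack_idx]])
    is_attack = np.concatenate([
        np.zeros(n_total - n_attack, dtype=bool),
        np.ones(n_attack, dtype=bool),
    ])

    # Shuffle so the attack rows are not grouped at the end.
    perm = rng.permutation(n_total)
    return X[perm], is_attack[perm]

# Synthetic stand-ins for benign and attack traffic features (hypothetical data).
benign = np.random.default_rng(1).normal(0.0, 1.0, size=(1000, 8))
attacks = np.random.default_rng(2).normal(4.0, 1.0, size=(200, 8))

for ratio in (0.0, 0.05, 0.10):
    X_train, hidden_labels = contaminate(benign, attacks, ratio)
    # An unsupervised detector would be fit on X_train only; hidden_labels
    # exists solely to verify the contamination level.
    print(ratio, X_train.shape, hidden_labels.mean())
```

A robustness study along these lines would train the same detector at each ratio and compare its score on a clean, labeled test set, so that any drop can be attributed to the contamination level.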

Findings

01. State-of-the-art algorithms degrade significantly under data contamination.

02. The proposed auto-encoder improves resistance to contaminated data.

03. Enhanced models maintain better detection performance with contaminated data.

Abstract

Deep learning (DL) has emerged as a crucial tool in network anomaly detection (NAD) for cybersecurity. While DL models for anomaly detection excel at extracting features and learning patterns from data, they are vulnerable to data contamination -- the inadvertent inclusion of attack-related data in training sets presumed benign. This study evaluates the robustness of six unsupervised DL algorithms against data contamination using our proposed evaluation protocol. Results demonstrate significant performance degradation in state-of-the-art anomaly detection algorithms when exposed to contaminated data, highlighting the critical need for self-protection mechanisms in DL-based NAD models. To mitigate this vulnerability, we propose an enhanced auto-encoder with a constrained latent representation, allowing normal data to cluster more densely around a learnable center in the latent space. Our…
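One way to read "a constrained latent representation" with "a learnable center" is a reconstruction loss plus a penalty on the distance between each latent code and a center vector. The sketch below shows only that reading; the abstract is truncated here and does not state the actual objective, so the single-layer encoder, the squared-distance penalty, the weight `lam`, and every function name are assumptions.

```python
import numpy as np

def encode(x, W_enc):
    # Hypothetical single-layer encoder with a tanh non-linearity.
    return np.tanh(x @ W_enc)

def per_sample_terms(x, W_enc, W_dec, center):
    """Return the reconstruction error and latent distance to the center, per row."""
    z = encode(x, W_enc)
    x_hat = z @ W_dec                      # hypothetical linear decoder
    recon = np.sum((x - x_hat) ** 2, axis=1)
    compact = np.sum((z - center) ** 2, axis=1)
    return recon, compact

def constrained_ae_loss(x, W_enc, W_dec, center, lam=0.1):
    # Assumed objective: mean reconstruction error plus lam times the mean
    # squared distance of latent codes to the center.
    recon, compact = per_sample_terms(x, W_enc, W_dec, center)
    return float(np.mean(recon + lam * compact))

def anomaly_score(x, W_enc, W_dec, center, lam=0.1):
    # Assumed scoring rule: the same two terms, kept per sample.
    recon, compact = per_sample_terms(x, W_enc, W_dec, center)
    return recon + lam * compact

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))               # synthetic feature batch
W_enc = rng.normal(scale=0.1, size=(8, 3))
W_dec = rng.normal(scale=0.1, size=(3, 8))
center = np.zeros(3)                       # would be trained jointly with the weights

print(constrained_ae_loss(x, W_enc, W_dec, center, lam=0.1))
print(anomaly_score(x, W_enc, W_dec, center, lam=0.1)[:3])
```

In a full implementation the weights and `center` would be updated together by gradient descent. The intuition is that a compactness term pulls the benign majority toward the center, which may leave the few contaminating samples farther out in latent space and therefore easier to score as anomalous.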

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Adversarial Robustness in Machine Learning