ENCODE: Encoding NetFlows for Network Anomaly Detection
Clinton Cao, Annibale Panichella, Sicco Verwer, Agathe Blaise, Filippo Rebecchi

TL;DR
This paper introduces a novel encoding method for NetFlow data that emphasizes feature frequency and context, improving the performance of machine learning models in detecting network anomalies, especially in Kubernetes environments.
Contribution
The work presents a new encoding algorithm tailored for network data that enhances anomaly detection by considering feature frequency and context, unlike existing generic preprocessing methods.
Findings
Encoding improves anomaly detection accuracy.
Models trained on encoded data outperform baseline methods.
Effective on both Kubernetes and public NetFlow datasets.
Abstract
NetFlow data is a popular network log format used by many network analysts and researchers. The advantages of using NetFlow over deep packet inspection are that it is easier to collect and process, and it is less privacy intrusive. Many works have used machine learning to detect network attacks using NetFlow data. The first step for these machine learning pipelines is to pre-process the data before it is given to the machine learning algorithm. Many approaches exist to pre-process NetFlow data; however, these simply apply existing methods to the data, not considering the specific properties of network data. We argue that for data originating from software systems, such as NetFlow or software logs, similarities in frequency and contexts of feature values are more important than similarities in the value itself. In this work, we propose an encoding algorithm that directly takes the…
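To make the abstract's central claim concrete, the following is a minimal sketch of what "similarity in frequency and context" could mean for a discrete NetFlow feature such as bytes per flow. It is not the authors' algorithm, whose description is truncated above. The function names `frequency_context_profiles` and `distance`, the choice of the immediately preceding and following value as "context", and the toy `flows` sequence are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def frequency_context_profiles(values):
    """Map each distinct value to a vector built from how often it occurs
    and which values appear directly before and after it.

    Hypothetical illustration only, not the encoding from the paper.
    """
    vocab = sorted(set(values))
    counts = Counter(values)
    prev_counts = defaultdict(Counter)  # value -> counts of preceding values
    next_counts = defaultdict(Counter)  # value -> counts of following values
    for before, after in zip(values, values[1:]):
        next_counts[before][after] += 1
        prev_counts[after][before] += 1

    total = len(values)
    profiles = {}
    for v in vocab:
        vec = [counts[v] / total]  # relative frequency of the value
        for ctx in (prev_counts[v], next_counts[v]):
            n = sum(ctx.values())
            # Normalised distribution over neighbouring values (zeros if none).
            vec.extend((ctx[u] / n if n else 0.0) for u in vocab)
        profiles[v] = vec
    return profiles


def distance(a, b):
    """Euclidean distance between two profile vectors."""
    return math.dist(a, b)


# Toy sequence of per-flow byte counts (made-up data).
flows = [64, 1500, 66, 1500, 64, 1500, 66, 1500, 70, 70, 70]
profiles = frequency_context_profiles(flows)

d_64_66 = distance(profiles[64], profiles[66])
d_66_70 = distance(profiles[66], profiles[70])
print(d_64_66 < d_66_70)
```

In this toy sequence, 64 and 66 occur equally often and are always surrounded by 1500, so their profiles coincide. The value 70 is numerically close to 66 but occurs in a different pattern, so its profile lies further away. A plain numeric scaling of the raw values would rank these pairs by magnitude alone.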
Taxonomy
Topics: Network Security and Intrusion Detection · Software System Performance and Reliability · Anomaly Detection Techniques and Applications
