Evaluating Tabular Representation Learning for Network Intrusion Detection
Muhammad Usman Butt, Andreas Hotho, Daniel Schlör

TL;DR
This paper systematically evaluates how modern tabular representation learning methods perform in network intrusion detection, comparing them to traditional autoencoders and transformer baselines across multiple datasets.
Contribution
It provides a comprehensive benchmark of state-of-the-art representation learning techniques for NetFlow data, highlighting their dataset-dependent performance and transferability.
Findings
Supervised methods outperform unsupervised anomaly detection.
TabICL performs best on the CIDDS dataset.
Representation transferability varies with dataset combinations.
Abstract
Classic Network Intrusion Detection Systems (NIDS) often rely on manual feature engineering to extract meaningful patterns from network traffic data. However, this approach requires domain expertise and runs counter to the widely adopted principle of modern machine learning and neural networks: that models themselves should learn meaningful representations directly from data. We investigate whether tabular representation learning techniques can improve intrusion detection performance by automatically learning robust feature representations for NetFlow data. This paper presents a systematic evaluation of state-of-the-art representation learning methods on benchmark NetFlow datasets, comparing against traditional autoencoders and end-to-end transformer baselines. We evaluate learned representations using both supervised classifiers and unsupervised anomaly detectors, with comprehensive…
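The two evaluation modes named in the abstract (a supervised classifier and an unsupervised anomaly detector, both applied to learned representations) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the embeddings are synthetic Gaussian vectors rather than representations of real NetFlow records, the nearest-centroid classifier and the distance-to-centroid anomaly score are simple stand-ins and not the classifiers or detectors used in the paper, and all function names are hypothetical.

```python
import random

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def auroc(scores, labels):
    """Probability that a random attack (label 1) scores higher than a
    random benign flow (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_embeddings(n, center, rng):
    """Synthetic stand-in for embeddings produced by a frozen encoder."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

rng = random.Random(0)
benign_center, attack_center = [0.0] * 4, [3.0] * 4

# Assumed split: attacks are a small minority, as is typical for anomaly detection.
train_X = make_embeddings(380, benign_center, rng) + make_embeddings(20, attack_center, rng)
train_y = [0] * 380 + [1] * 20
test_X = make_embeddings(100, benign_center, rng) + make_embeddings(100, attack_center, rng)
test_y = [0] * 100 + [1] * 100

# Supervised evaluation: nearest-centroid classifier fitted with labels.
c_benign = centroid([x for x, y in zip(train_X, train_y) if y == 0])
c_attack = centroid([x for x, y in zip(train_X, train_y) if y == 1])
pred = [int(sq_dist(x, c_attack) < sq_dist(x, c_benign)) for x in test_X]
supervised_acc = sum(p == y for p, y in zip(pred, test_y)) / len(test_y)

# Unsupervised evaluation: labels are ignored when fitting; the anomaly
# score is the distance to the centroid of all (unlabeled) training data.
c_all = centroid(train_X)
anomaly_scores = [sq_dist(x, c_all) for x in test_X]
unsupervised_auroc = auroc(anomaly_scores, test_y)

print(f"supervised accuracy: {supervised_acc:.3f}")
print(f"unsupervised AUROC:  {unsupervised_auroc:.3f}")
```

In both modes the representation is held fixed and only the downstream model changes, so differences in the scores reflect how well the embedding separates benign from attack traffic. The numbers printed here describe the synthetic data only and say nothing about the paper's results.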
