MambaNetBurst: Direct Byte-level Network Traffic Classification without Tokenization or Pretraining
Gayan K. Kulatilleke, Siamak Layeghy, Mahsa Baktashmotlagh, Marius Portmann

TL;DR
MambaNetBurst is a compact, tokenizer-free byte-level network traffic classifier that operates directly on raw packet bytes, achieving strong results without pre-training or tokenization.
Contribution
It introduces a byte-level classification method built on a Mamba-2 backbone that needs neither tokenization nor pre-training, and demonstrates its effectiveness across six public benchmarks.
Findings
Outperforms or matches heavier, pre-trained models on six benchmarks.
Preserving byte-level temporal resolution is critical for accuracy.
Moderate state sizes suffice for robust generalization.
Abstract
We present MambaNetBurst, a compact tokenizer-free byte-level sequence classifier for network burst classification based on a Mamba-2 backbone. In contrast to most recent strong traffic-classification and intrusion-detection approaches, our method operates directly on raw packet bytes, avoids tokenization, patching, and heavy engineered multimodal representations, and does not require any self-supervised pre-training stage. Given a packet flow, we form a fixed-length burst from the first few packets, embed the resulting byte sequence, append a learnable CLS token, and process it with a stack of residual pre-normalized Mamba-2 blocks for end-to-end supervised classification. Across six public benchmarks spanning encrypted mobile app identification, VPN/Tor traffic classification, malware traffic classification, and IoT attack traffic, MambaNetBurst achieves consistently strong results…
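The burst-formation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `make_burst`, the per-packet truncation and padding scheme, the default sizes, and the reserved ids `PAD_ID` and `CLS_ID` are all assumptions made for illustration. The abstract only states that a fixed-length burst is formed from the first few packets and that a learnable CLS token is appended.

```python
from typing import List, Sequence

# Hypothetical vocabulary layout: raw byte values 0-255 are used directly as ids,
# with two extra ids reserved. The paper's actual layout is not given in the abstract.
PAD_ID = 256      # assumed id for padding short or missing packets
CLS_ID = 257      # assumed id for the learnable CLS token
VOCAB_SIZE = 258  # size an embedding table would need under these assumptions


def make_burst(
    packets: Sequence[bytes],
    num_packets: int = 4,       # assumed value for "the first few packets"
    bytes_per_packet: int = 64  # assumed per-packet byte budget
) -> List[int]:
    """Turn a flow into a fixed-length sequence of byte ids, then append CLS."""
    ids: List[int] = []
    for i in range(num_packets):
        # Truncate long packets; treat missing packets as empty.
        payload = packets[i][:bytes_per_packet] if i < len(packets) else b""
        ids.extend(payload)  # iterating bytes yields ints in 0..255
        ids.extend([PAD_ID] * (bytes_per_packet - len(payload)))
    ids.append(CLS_ID)  # CLS is appended after the byte sequence
    return ids


if __name__ == "__main__":
    flow = [b"\x45\x00\x00\x28", b"\x17\x03\x03"]
    burst = make_burst(flow, num_packets=3, bytes_per_packet=4)
    print(len(burst), burst)
```

Every byte keeps its own position in the output, so no tokenizer or patching step sits between the raw packet and the model. Under these assumptions the sequence length is always `num_packets * bytes_per_packet + 1`, and the resulting ids would be fed to an embedding layer ahead of the Mamba-2 blocks.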
