# FlowletFormer: Network Behavioral Semantic Aware Pre-training Model for Traffic Classification

**Authors:** Liming Liu, Ruoyu Li, Qing Li, Meijia Hou, Yong Jiang, Mingwei Xu

arXiv: 2508.19924 · 2025-08-28

## TL;DR

FlowletFormer is a BERT-based pre-training model that improves network traffic classification by capturing hierarchical protocol semantics and inter-packet relationships, leading to higher accuracy and better few-shot learning.

## Contribution

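The paper introduces FlowletFormer, a BERT-based pre-training model built from three domain-specific components: a Coherent Behavior-Aware Traffic Representation Model that segments traffic into semantically meaningful units, a Protocol Stack Alignment-Based Embedding Layer that captures multilayer protocol semantics, and Field-Specific and Context-Aware Pretraining Tasks that strengthen inter-packet and inter-flow learning.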

## Key findings

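- Outperforms existing methods in traffic representation effectiveness and classification accuracy
- Improves few-shot learning capability over existing methods
- Shows better comprehension of network transmission principles, such as the stateful connections of TCP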

## Abstract

Network traffic classification using pre-training models has shown promising results, but existing methods struggle to capture packet structural characteristics, flow-level behaviors, hierarchical protocol semantics, and inter-packet contextual relationships. To address these challenges, we propose FlowletFormer, a BERT-based pre-training model specifically designed for network traffic analysis. FlowletFormer introduces a Coherent Behavior-Aware Traffic Representation Model for segmenting traffic into semantically meaningful units, a Protocol Stack Alignment-Based Embedding Layer to capture multilayer protocol semantics, and Field-Specific and Context-Aware Pretraining Tasks to enhance both inter-packet and inter-flow learning. Experimental results demonstrate that FlowletFormer significantly outperforms existing methods in the effectiveness of traffic representation, classification accuracy, and few-shot learning capability. Moreover, by effectively integrating domain-specific network knowledge, FlowletFormer shows better comprehension of the principles of network transmission (e.g., stateful connections of TCP), providing a more robust and trustworthy framework for traffic analysis.
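To make the idea of segmenting traffic into behaviorally coherent units concrete, the following is a minimal sketch that splits one flow's packets into flowlets whenever the idle gap between consecutive packets exceeds a threshold. The abstract does not state the segmentation criterion FlowletFormer actually uses, so the gap-based rule, the `Packet` type, the `split_into_flowlets` function, and the `gap_threshold` value are all illustrative assumptions rather than the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    """Hypothetical packet record: arrival time in seconds plus raw bytes."""
    timestamp: float
    payload: bytes


def split_into_flowlets(
    packets: List[Packet], gap_threshold: float = 0.5
) -> List[List[Packet]]:
    """Group a flow's packets into flowlets separated by idle gaps.

    Assumption: a new flowlet starts when the inter-arrival time to the
    previous packet is greater than `gap_threshold` seconds. The paper
    may use a different criterion.
    """
    flowlets: List[List[Packet]] = []
    current: List[Packet] = []
    # Process packets in arrival order so gaps are measured correctly.
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        if current and pkt.timestamp - current[-1].timestamp > gap_threshold:
            flowlets.append(current)
            current = []
        current.append(pkt)
    if current:
        flowlets.append(current)
    return flowlets


if __name__ == "__main__":
    # Two bursts and one isolated packet, separated by gaps above 0.5 s.
    demo = [Packet(t, b"") for t in (0.0, 0.1, 0.2, 1.5, 1.6, 5.0)]
    print([len(f) for f in split_into_flowlets(demo)])  # prints [3, 2, 1]
```

Under this assumption, each flowlet would then be tokenized and fed to the model as one input unit; the threshold trades off longer context per unit against mixing unrelated bursts.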

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.19924/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/2508.19924/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/2508.19924/full.md

---
Source: https://tomesphere.com/paper/2508.19924