Mean Masked Autoencoder with Flow-Mixing for Encrypted Traffic Classification

Xiao Liu; Xiaowei Fu; Fuxiang Huang; Lei Zhang

arXiv:2603.29537·cs.CR·April 1, 2026

TL;DR

This paper introduces MMAE, a novel self-supervised pre-training framework for encrypted traffic classification that leverages flow mixing and packet importance to improve multi-granularity understanding and discriminative feature learning.

Contribution

The paper proposes MMAE, combining flow mixing, self-distillation, and packet importance masking to enhance encrypted traffic classification beyond isolated flow analysis.

Findings

01 MMAE achieves state-of-the-art results on multiple datasets.

02 FlowMix improves model robustness against distorted tokens.

03 Packet-importance masking enhances semantic understanding.

Abstract

Network traffic classification using self-supervised pre-training models based on Masked Autoencoders (MAE) has demonstrated great potential. However, existing methods are confined to isolated byte-level reconstruction of individual flows and lack adequate perception of the multi-granularity contextual relationships in traffic. To address this limitation, we propose Mean MAE (MMAE), a teacher-student MAE paradigm with a flow mixing strategy for building an encrypted traffic pre-training model. MMAE employs a self-distillation mechanism for teacher-student interaction, where the teacher provides unmasked flow-level semantic supervision to advance the student from local byte reconstruction to multi-granularity comprehension. To break the information bottleneck in individual flows, we introduce a dynamic Flow Mixing (FlowMix) strategy to replace the traditional random masking mechanism. By…
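The two ideas named in the abstract, a teacher that supervises the student and token mixing across flows in place of random masking, can be illustrated with a minimal sketch. Nothing below is taken from the paper or the lx6c78/MMAE repository. It assumes that "Mean" refers to a mean-teacher style exponential moving average (EMA) of student weights, and that mixing means swapping a random subset of one flow's tokens for the tokens at the same positions in another flow. The function names `ema_update` and `flow_mix` and the parameters `momentum`, `mix_ratio`, and `seed` are hypothetical.

```python
import random


def ema_update(teacher, student, momentum=0.99):
    """Assumed mean-teacher rule: move each teacher weight toward the student weight."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def flow_mix(flow_a, flow_b, mix_ratio=0.5, seed=0):
    """Hypothetical mixing step: replace a random subset of flow_a's tokens
    with the tokens of flow_b at the same positions.

    Returns the mixed token list and the sorted list of replaced positions.
    """
    if len(flow_a) != len(flow_b):
        raise ValueError("flows must have the same number of tokens")
    rng = random.Random(seed)
    n_mix = int(len(flow_a) * mix_ratio)
    mixed_positions = set(rng.sample(range(len(flow_a)), n_mix))
    mixed = [flow_b[i] if i in mixed_positions else flow_a[i]
             for i in range(len(flow_a))]
    return mixed, sorted(mixed_positions)


# Toy usage: weights are plain floats, tokens are plain strings.
teacher_weights = ema_update([1.0, 2.0], [0.0, 0.0], momentum=0.9)
mixed_flow, replaced = flow_mix(["a0", "a1", "a2", "a3"],
                                ["b0", "b1", "b2", "b3"],
                                mix_ratio=0.5)
```

In a real pre-training loop, the weights would be model tensors and the tokens would be byte or packet embeddings. The paper's actual mixing schedule and distillation loss may differ from this sketch.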

Code & Models

Repositories

lx6c78/MMAE
github
