TL;DR
This paper introduces BGTD, a new benchmark combining raw bytes and annotations for explainable encrypted traffic analysis, and proposes mmTraffic, a multimodal reasoning framework that produces human-readable reports with high accuracy.
Contribution
It presents the first multimodal benchmark for encrypted traffic interpretation and a novel reasoning architecture that enhances explainability and accuracy.
Findings
mmTraffic generates high-fidelity, evidence-grounded traffic reports.
The framework maintains competitive classification accuracy.
BGTD enriches datasets with semantic annotations for better reasoning.
Abstract
Network traffic, as a key media format, is crucial for ensuring security and communications in modern internet infrastructure. While existing methods offer excellent performance, they face two key bottlenecks: (1) they fail to capture multidimensional semantics beyond unimodal sequence patterns; (2) they are black boxes that output only category labels, with no auditable reasoning process. We identify a key cause: existing network traffic datasets are primarily designed for classification and inherently lack rich semantic annotations, so they cannot support generating human-readable evidence reports. To address this data scarcity, this paper proposes the Byte-Grounded Traffic Description (BGTD) benchmark, the first to combine raw bytes with structured expert annotations. BGTD provides the necessary behavioral features and verifiable chains of evidence for multimodal reasoning towards…
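To make the idea of pairing raw bytes with structured annotations concrete, here is a minimal Python sketch of what one byte-grounded record could look like. It is not the BGTD schema, which this summary does not describe: the class names `EvidenceSpan` and `TrafficRecord`, their fields, and the sample payload are all hypothetical, chosen only to show how an annotation can point back at the exact bytes it relies on.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceSpan:
    """Hypothetical: a byte range that one annotation refers to."""
    start: int  # inclusive offset into raw_bytes
    end: int    # exclusive offset into raw_bytes
    note: str   # human-readable explanation of the span


@dataclass
class TrafficRecord:
    """Hypothetical record pairing a raw payload with annotations."""
    raw_bytes: bytes
    label: str
    evidence: list[EvidenceSpan] = field(default_factory=list)

    def cited_bytes(self) -> list[tuple[str, bytes]]:
        """Return each note together with the bytes it cites."""
        return [(e.note, self.raw_bytes[e.start:e.end]) for e in self.evidence]

    def is_grounded(self) -> bool:
        """True if every span is non-empty and lies inside the payload."""
        n = len(self.raw_bytes)
        return all(0 <= e.start < e.end <= n for e in self.evidence)


# Illustrative payload: the first five bytes of a TLS record header.
record = TrafficRecord(
    raw_bytes=bytes.fromhex("160303002a"),
    label="example-class",  # placeholder label, not a BGTD category
    evidence=[EvidenceSpan(0, 1, "content type 0x16 (TLS handshake)")],
)
```

A check such as `record.is_grounded()` is one simple way a report's claims could be audited against the payload: any annotation whose span falls outside the bytes cannot be verified and can be rejected.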
