Generalized Encrypted Traffic Classification Using Inter-Flow Signals

Federica Bianchi; Edoardo Di Paolo; Angelo Spognardi

arXiv:2508.21558 · cs.CR · September 1, 2025


TL;DR

This paper introduces a new encrypted traffic classification model that uses inter-flow signals to analyze raw PCAP data, achieving high accuracy and generalizability across multiple tasks and datasets.

Contribution

It proposes a novel inter-flow signal-based approach that operates directly on raw PCAP data, enhancing flexibility and performance over existing methods.

Findings

• Achieves up to 99% accuracy in some cases

• Outperforms existing methods on most datasets

• Demonstrates robustness and adaptability across tasks

Abstract

In this paper, we present a novel encrypted traffic classification model that operates directly on raw PCAP data without requiring prior assumptions about traffic type. Unlike existing methods, it is generalizable across multiple classification tasks and leverages inter-flow signals - an innovative representation that captures temporal correlations and packet volume distributions across flows. Experimental results show that our model outperforms well-established methods in nearly every classification task and across most datasets, achieving up to 99% accuracy in some cases, demonstrating its robustness and adaptability.

Tables

Table 1: List of features extracted for incoming packets, outgoing packets, and their combined set.

Level	Features
Packet-Level	Mean Packet Size, Variance of Packet Size, Standard Deviation of Packet Size, Mean Absolute Deviation of Packet Size, Skewness of Packet Size, Kurtosis of Packet Size, Percentiles (10th to 90th)
Intra-Flow Level	Standard Deviation of Flow Duration, Mean Number of Packets per Flow

Table 2: Results and comparison of our method with MAppGraph and FlowPrint across different datasets.

Model	Our Method				MAppGraph				FlowPrint
Dataset	Acc.	Prec.	Rec.	F1	Acc.	Prec.	Rec.	F1	Acc.	Prec.	Rec.	F1
MAppGraph	0.9517	0.9473	0.9385	0.9418	0.9346	0.9364	0.9346	0.9347	0.8664	0.8718	0.8664	0.8662
PostQuantumTLS	0.5013	0.5624	0.5013	0.4941	0.3684	0.2663	0.2760	0.2612	0.7077	0.7177	0.7077	0.6927
Cross Platform	0.3418	0.5088	0.3120	0.2748	0.2686	0.1853	0.2130	0.1862	0.8692	0.9070	0.8692	0.8745
ISCX-VPN	0.9903	0.9895	0.9882	0.9885	0.8846	0.6988	0.7195	0.7055	0.9484	0.9728	0.9484	0.9557
ISCX-nonVPN	0.9918	0.9887	0.9811	0.9845	0.9244	0.9022	0.8875	0.8915	0.7791	0.8770	0.7791	0.8150
CICAndMal2017	0.8308	0.8315	0.8285	0.8295	0.8518	0.7507	0.6467	0.6755	0.7836	0.7837	0.7836	0.7829
IoT-Sentinel	0.9280	0.9198	0.9293	0.8878	0.6553	0.6002	0.5451	0.5481	0.7084	0.7084	0.7084	0.7084
CSTNET-TLS 1.3	0.7497	0.7408	0.7079	0.7142	-	-	-	-	0.1953	0.1987	0.1953	0.1834

Equations

$$\langle \textit{src}_{\textit{ip}},\ \textit{dst}_{\textit{ip}},\ \textit{src}_{\textit{port}},\ \textit{dst}_{\textit{port}},\ \textit{proto} \rangle$$

$$C_{i} = \{\, p_{j} \in P \mid t_{j} \in [t_{\textit{start}},\ t_{\textit{start}} + W] \,\}$$

$$p_{j} = (t_{j}, l_{j})$$

$$F_{k} = \{(t_{1}^{k}, l_{1}^{k}), (t_{2}^{k}, l_{2}^{k}), \dots, (t_{n_{k}}^{k}, l_{n_{k}}^{k})\} \quad \text{where } t_{1}^{k} < t_{2}^{k} < \dots < t_{n_{k}}^{k}$$

$$t_{\text{min}} = \min_{k=1,\dots,n}\ \min_{i=1,\dots,n_{k}} t_{i}^{k}, \qquad t_{\text{max}} = \max_{k=1,\dots,n}\ \max_{i=1,\dots,n_{k}} t_{i}^{k}$$

$$T = \{t_{\text{min}},\ t_{\text{min}} + \delta,\ \dots,\ t_{\text{max}}\}$$

$$A_{k} = \sum_{j=1}^{n_{k}} l_{j}^{k}$$

$$S(t_{\text{min}} + \delta t) = \sum_{k=1}^{n} A_{k} \sum_{j=1}^{n_{k}} l_{j}^{k}\, \mathbf{1}_{[t_{\text{min}} + \delta t,\ t_{\text{min}} + \delta (t+1))}(t_{j}^{k}), \qquad t = 0, \dots, \frac{t_{\text{max}} - t_{\text{min}}}{\delta}$$

$$\mathbf{1}_{B}(x) = \begin{cases} 1 & \text{if } x \in B, \\ 0 & \text{otherwise} \end{cases}$$


Full text

Federica Bianchi (ORCID 0009-0006-2698-4484), Edoardo Di Paolo (ORCID 0000-0001-9216-8430), Angelo Spognardi (ORCID 0000-0001-6935-0701)

Computer Science Department, Sapienza University of Rome

This work was supported by project SERICS (PE00000014) under the NRRP MUR program funded by the EU-NGEU.


Keywords: Encrypted Traffic Analysis · Network Security · Machine Learning

1 Introduction

Network traffic analysis is a fundamental aspect of cybersecurity, network management, and performance optimization. Analyzing network activity provides valuable insights into communication patterns, the services and applications in use, and potential security threats. As a result, traffic analysis plays a key role in a wide range of real-world applications, from securing enterprise networks and detecting malware to enhancing service quality and protecting user privacy. In recent years, the increasing adoption of encryption protocols has rendered plaintext payload analysis ineffective, shifting the focus to methods based on machine learning and deep learning. Encrypted traffic analysis supports multiple critical goals [21], including the identification of network assets like IoT systems and mobile devices, enabling better asset management and vulnerability assessment. It also facilitates network characterization by evaluating service quality and user experience metrics. Moreover, it plays a key role in privacy leakage detection, revealing user activities and accessed apps, services, or websites through encrypted traffic. Finally, it is crucial for identifying malicious behavior and anomalies, a need underscored by the rising prevalence of malware and attacks on enterprise, IoT, and blockchain infrastructures.

Despite the broad applicability of encrypted traffic classification, most existing research develops methods tailored to a single task and rarely evaluates them across multiple analysis goals, which limits their generalizability. Moreover, previous work focuses mainly on statistical features extracted from packets or within individual flows (intra-flow level). Only a few studies explore relationships among multiple flows over time (inter-flow level) or integrate insights from multiple perspectives [17, 25, 26].

To address these limitations, we present a novel encrypted traffic classification model that generalizes across multiple domains and requires no prior assumptions about traffic type. Our approach proves effective across various tasks, including mobile application and website fingerprinting, malware detection, IoT device fingerprinting, and general traffic classification (with and without VPN), making it a highly versatile solution. Although our model has been tested only on the aforementioned tasks, its generalizability gives it significant potential for other encrypted traffic classification tasks. Another key innovation of our method is the introduction of signals, a novel inter-flow representation that captures temporal correlations among flows and packet volume distributions. Drawing inspiration from telecommunications and signal processing [18], we treat network flows as temporal signals. Unlike statistical features, which summarize flow characteristics in isolation, signals model how multiple flows interact over time, offering a richer, context-aware classification of encrypted traffic. To our knowledge, this is the first general-purpose encrypted traffic classifier using inter-flow signal representations.

The main contributions of this paper can be summarized as follows:

• We propose a general-purpose encrypted traffic classification model that successfully generalizes across multiple analysis goals.

• We introduce a novel inter-flow representation, called signals, which captures temporal correlations among flows and packet volume distributions.

• We conduct an extensive evaluation across eight diverse datasets, covering multiple encrypted traffic classification tasks.

• We compare our approach against well-established methods on the same datasets, demonstrating superior performance in nearly every classification task and on most datasets.

2 Related Work

One of the main applications of encrypted network traffic analysis has been the identification of mobile applications. FlowPrint [25] analyzed temporal correlations among destination-related features of network traffic at the inter-flow level and created a database of application-specific fingerprints. When classifying new traffic, their method generated a corresponding fingerprint and matched it against the database. Taylor et al. in AppScanner [24] adopted machine learning models using statistical features at the intra-flow level and packet sizes to recognize apps. More recent methods leverage deep learning. Pham et al. [17] proposed MAppGraph, a method that classifies mobile applications by constructing a communication graph of network destinations for each app. The nodes of the graph for each app are defined by tuples of IP addresses and ports of the services connected by the app, while the edges represent weighted communication correlations among these nodes. They then used deep graph convolutional neural networks (DGCNN) to learn the various communication behaviors of mobile apps. Some works, instead, feed raw packet bytes directly into deep learning models, such as in PEAN [10].

Website fingerprinting aims to identify the specific website visited by a user despite encryption mechanisms. Many methods use machine learning to infer web activity [20, 22], while others rely on deep learning, such as [4, 27], which classify visited websites using CNNs.

Generic traffic classification targets broad service categories (e.g., email, VoIP). Chen et al. [3] employed Long Short-Term Memory (LSTM) on message size sequences for early prediction, while Luo et al. [13] proposed a self-supervised method for traffic classification to enhance flow representations using trace data.

Encrypted traffic analysis has also been widely used to distinguish malicious activity from benign traffic. The approaches range from traditional machine learning classifiers [6, 8] to deep models. Feng et al. [5] introduced a two-layer CNN-AutoEncoder to classify malware, while more recently Liu et al. [12] modeled temporal patterns in Dalvik opcode sequences with LSTM and temporal convolutional network (TCN). In IoT device fingerprinting, statistical profiling has been effective [23], while recent efforts employ deep learning for IoT malware detection [1].

3 Problem Definition

Encrypted network traffic classification categorizes network traffic according to different analysis goals. Indeed, traffic can be generated from various sources, such as mobile applications, web pages or websites, services, malware, etc.

Unlike approaches that classify individual traffic flows or packets, our method operates on traffic chunks. A traffic chunk is a bounded segment of network traffic that spans a specific time window, which may include multiple concurrent or sequential flows. Formally, let $P$ be the set of all the observed network packets in a monitored traffic session. Each packet $p_{j}\in P$ belongs to a specific network flow, which consists of a series of packets sharing the standard 5-tuple:

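$$\langle \textit{src}_{\textit{ip}},\ \textit{dst}_{\textit{ip}},\ \textit{src}_{\textit{port}},\ \textit{dst}_{\textit{port}},\ \textit{proto} \rangle, \tag{1}$$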

where $\textit{src}_{\textit{ip}}$ and $\textit{dst}_{\textit{ip}}$ are the source and destination IP addresses, $\textit{src}_{\textit{port}}$ and $\textit{dst}_{\textit{port}}$ are the source and destination ports, and $\textit{proto}$ is the transport-layer protocol. A traffic chunk $C_{i}$ is defined as a set of packets occurring within a fixed time window $W$, such that:

$$C_{i} = \{\, p_{j} \in P \mid t_{j} \in [t_{\textit{start}},\ t_{\textit{start}} + W] \,\},$$

where $t_{j}$ is the timestamp of packet $p_{j}$, $t_{\textit{start}}$ is the starting time of the chunk, and $W$ is the pre-defined window duration.

A traffic chunk may contain different flow communication scenarios:

• Single Flow Communication: Occurs when a network interaction involves one isolated flow between a client and server, exhibiting simple request-response behavior, without concurrent or sequential connections.

• Sequential Flow Communication: Involves multiple flows initiated sequentially for different stages of the communication. Each flow contributes to a broader activity.

• Concurrent Flow Communication: In many real-world scenarios, endpoints initiate multiple flows in parallel towards different network destinations to manage different aspects of the communication.
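The three scenarios can be illustrated with a small sketch that labels a chunk from the time spans of its flows. The function name `flow_scenario` and the `(start, end)` interval representation are assumptions made for this example, since the scenarios are described only in prose. A chunk that mixes sequential and concurrent flows is labeled concurrent here, which is a simplification.

```python
from typing import List, Tuple


def flow_scenario(intervals: List[Tuple[float, float]]) -> str:
    """Label a chunk from the (first, last) packet timestamps of its flows.

    Hypothetical helper for illustration; not part of the described method.
    """
    if not intervals:
        return "empty"
    if len(intervals) == 1:
        return "single"
    ordered = sorted(intervals)  # sort flows by start time
    latest_end = ordered[0][1]
    for start, end in ordered[1:]:
        # A flow starting before all earlier flows have ended overlaps one of them.
        if start < latest_end:
            return "concurrent"
        latest_end = max(latest_end, end)
    return "sequential"
```

For instance, two flows spanning `(0, 2)` and `(1, 3)` overlap in time and are labeled concurrent, while `(0, 1)` and `(1.5, 2)` are labeled sequential.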

Since we evaluate our model on multiple classification tasks, the ground-truth label varies depending on the task. For mobile applications, it identifies the specific application in use. In website classification, it corresponds to the accessed site, whether through standard or VPN-protected browsing. For malware detection, the label distinguishes between benign and malicious traffic. In service classification, it indicates the type of network service (e.g., chat, email, VoIP). Finally, for IoT classification, it reflects the type of IoT device involved in the communication.

4 Methodology

We propose a general traffic classification method that operates directly on raw PCAP files, without assumptions about the nature of the traffic. This flexible design supports various analysis goals, including mobile application fingerprinting, website fingerprinting, generic traffic classification, malware detection, and IoT traffic classification. To achieve effective classification, we extract features at multiple representation levels. At the inter-flow level, we introduce signals, a key novelty of our work, which capture temporal relationships, volume, and timing across flows within the same time window. We also compute statistical features at the packet and intra-flow levels for detailed insights into flow behavior and packet distributions.
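As a rough illustration of the packet-level statistics listed in Table 1, the sketch below computes them for one set of packet sizes using only the Python standard library. The function name `packet_size_features`, the use of population (rather than sample) variance, and non-excess kurtosis are assumptions of this example; the text does not specify these conventions.

```python
import math
import statistics
from typing import Dict, List


def packet_size_features(sizes: List[float]) -> Dict[str, float]:
    """Packet-level statistics for one set of packet sizes (at least 2 packets)."""
    n = len(sizes)
    mean = statistics.fmean(sizes)
    var = statistics.pvariance(sizes, mu=mean)  # population variance (assumption)
    std = math.sqrt(var)
    mad = sum(abs(x - mean) for x in sizes) / n  # mean absolute deviation
    # Standardized third and fourth central moments; set to 0 for constant input.
    skew = sum((x - mean) ** 3 for x in sizes) / (n * std ** 3) if std else 0.0
    kurt = sum((x - mean) ** 4 for x in sizes) / (n * std ** 4) if std else 0.0
    feats = {"mean": mean, "var": var, "std": std,
             "mad": mad, "skew": skew, "kurt": kurt}
    # quantiles(n=10) returns the 9 cut points: 10th, 20th, ..., 90th percentile.
    cuts = statistics.quantiles(sizes, n=10, method="inclusive")
    for i, q in enumerate(cuts, start=1):
        feats[f"p{i * 10}"] = q
    return feats
```

Following the caption of Table 1, the same function would be applied three times per chunk: to incoming packets, to outgoing packets, and to their combined set.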

Traffic Pre-Processing. Following prior work [16, 17, 25], we filter out packets based on port numbers, discarding traffic from common background services (e.g., DNS, DHCP). These ports are typically not related to specific apps, services, or websites and do not offer distinguishing characteristics for classification [17, 27]. Removing them reduces irrelevant variation and improves the clarity of traffic features for classification.

After PCAPs are filtered, we split them into smaller chunks, or time windows, of network activity, to improve computational efficiency and analytical precision. Segmenting traffic into manageable portions reduces the complexity and overhead associated with analyzing large captures, which is essential for real-time or high-volume scenarios. Additionally, breaking down traffic into time windows allows us to focus on discrete periods of network communication, enabling a finer-grained analysis of the traffic's temporal dynamics, as network behavior can vary over time. To prevent information loss at the boundaries of these time windows, we introduce an overlap parameter, which defines the fraction of two consecutive chunks that overlap. Without overlap, some traffic patterns occurring near chunk edges may be split between consecutive windows, leading to fragmented representations. Choosing the time window duration and overlap involves trade-offs. Short windows capture fine-grained interactions but can be noisy and fragmented, while longer ones smooth out fluctuations and provide a broader perspective, but risk obscuring short-term behaviors and delaying classification. Similarly, large overlaps preserve continuity but add redundancy, whereas small overlaps are more efficient but may disrupt dependencies between related flows. We tune these parameters empirically to balance these trade-offs.
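The filtering and windowing steps can be sketched as follows. This is a minimal illustration under stated assumptions: the packet layout, the names `filter_background` and `split_into_chunks`, and the port set (DNS 53, DHCP 67 and 68) are chosen for the example, and the actual port list and parameter values are not given in the text. Windows are closed intervals of length `window`, and consecutive windows start `window * (1 - overlap)` seconds apart.

```python
from typing import List, Tuple

# (timestamp, length, src_port, dst_port): an illustrative packet layout
Packet = Tuple[float, int, int, int]

# DNS (53) and DHCP (67, 68) as examples; the full port list is an assumption
BACKGROUND_PORTS = {53, 67, 68}


def filter_background(packets: List[Packet]) -> List[Packet]:
    """Drop packets whose source or destination port is a background service."""
    return [p for p in packets
            if p[2] not in BACKGROUND_PORTS and p[3] not in BACKGROUND_PORTS]


def split_into_chunks(packets: List[Packet], window: float,
                      overlap: float) -> List[List[Packet]]:
    """Split packets into windows of `window` seconds.

    `overlap` in [0, 1) is the fraction shared by two consecutive windows.
    """
    if window <= 0 or not 0 <= overlap < 1:
        raise ValueError("window must be > 0 and overlap in [0, 1)")
    if not packets:
        return []
    packets = sorted(packets, key=lambda p: p[0])
    first, last = packets[0][0], packets[-1][0]
    step = window * (1 - overlap)
    chunks = []
    i = 0
    while first + i * step <= last:
        t_start = first + i * step
        chunk = [p for p in packets if t_start <= p[0] <= t_start + window]
        if chunk:  # skip windows with no traffic
            chunks.append(chunk)
        i += 1
    return chunks
```

With a 2-second window and an overlap of 0.5, a packet at second 1 falls into both the window starting at 0 and the one starting at 1, so patterns near a window edge are seen whole in at least one chunk.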

Traffic Representation. To achieve a thorough understanding of traffic patterns, we represent them at multiple levels: packet level, intra-flow level, and inter-flow level. This choice comes from the intuition that traffic may leak different kinds of information based on the viewpoint of the analysis, each providing valuable insights into its behavior and the overall dynamics of the network. At the inter-flow level, we aim to leverage the temporal relationships between the sending and receiving of packets across multiple flows. The key hypothesis underlying this representation is that network communications generate discernible patterns in flow initiation and sequencing, which can be used for traffic classification. When analyzing the network behavior of communicating endpoints within specific time windows (i.e., chunks), we observe the scenarios outlined in Section 3, namely single flow communication, sequential flow communication, and concurrent flow communication.

Traditional studies on encrypted traffic classification typically isolate individual flows for analysis. In contrast, our approach aims to capture these sequential and concurrent flow communication scenarios, examining how flows coexist and interact within the same temporal context. This provides a more holistic representation that integrates both the byte volume and the temporal relationships between flows, such as flow initiation, sequencing, and overlap. To capture these inter-flow dynamics, we represent network traffic as a discrete time-series signal. This transformation moves beyond simple flow classification by analyzing traffic as a structured sequence that encapsulates both the exchanged data volume and the key temporal features of flow interactions. The general idea is that each chunk is transformed into a unified signal that aggregates all packets exchanged during that time window, regardless of whether they originate from a single flow or multiple concurrent flows. The rationale behind constructing a unified signal is to capture the endpoint's global communication pattern, rather than focusing on individual flows.

With this unified signal, we aim to: (i) capture the volume of transmitted data across all flows within the chunk, along with the temporal structure of packet transmissions, revealing distinct traffic patterns; (ii) quantify each flow's contribution to the overall network activity, helping to identify dominant interactions in data transfer; and (iii) analyze flow interactions to help distinguish different types of traffic. For example, mobile applications may exhibit consistent flow initiation and overlap patterns, such as connections to authentication servers, content delivery networks, or analytics services. This predictable behavior suggests that the initiation of flows follows a relatively consistent pattern in both timing and volume.

Signal Creation.

Each network packet is represented as a tuple containing its timestamp and packet length:

$$p_{j} = (t_{j}, l_{j}),$$

where $t_{j}$ represents the time at which the packet was transmitted as recorded in the PCAP file (approximated in seconds for analysis) and $l_{j}$ denotes its size in bytes. A network flow $F_{k}$ is then defined as an ordered sequence of packets that share the same five-tuple identifier, as defined in Equation 1:

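$$F_{k} = \{(t_{1}^{k}, l_{1}^{k}), (t_{2}^{k}, l_{2}^{k}), \dots, (t_{n_{k}}^{k}, l_{n_{k}}^{k})\} \quad \text{where } t_{1}^{k} < t_{2}^{k} < \dots < t_{n_{k}}^{k}.$$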

Each flow consists of packets exchanged between a client and a server, and multiple flows may be present within a given traffic chunk, as described in the previous sections. To construct a single unified signal from multiple flows (sequential or concurrent), we aggregate their packet information over a coherent time axis to ensure that all flows are synchronized within the time window of the chunk, aligning their packet data into a cohesive representation.
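A minimal sketch of this grouping step: packets are keyed by their five-tuple and each flow is stored as a time-ordered list of `(timestamp, length)` pairs. The `RawPacket` type and `group_into_flows` are names introduced for the example. The key used here is directional, so the two directions of a client-server exchange end up in separate flows; whether the method merges both directions into one flow is not stated, and this choice is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class RawPacket(NamedTuple):
    """Illustrative packet record; field names are assumptions."""
    timestamp: float
    length: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str


FiveTuple = Tuple[str, str, int, int, str]
Flow = List[Tuple[float, int]]  # [(t_1, l_1), (t_2, l_2), ...]


def group_into_flows(packets: List[RawPacket]) -> Dict[FiveTuple, Flow]:
    """Group packets by five-tuple and order each flow by timestamp."""
    flows: Dict[FiveTuple, Flow] = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        flows[key].append((p.timestamp, p.length))
    for seq in flows.values():
        seq.sort(key=lambda pair: pair[0])  # enforce t_1 <= t_2 <= ...
    return dict(flows)
```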

Let $F=\{F_{1},F_{2},...,F_{n}\}$ represent the set of flows within a chunk. We determine the minimum and maximum timestamps across all flows:

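$$t_{\text{min}} = \min_{k=1,\dots,n}\ \min_{i=1,\dots,n_{k}} t_{i}^{k}, \qquad t_{\text{max}} = \max_{k=1,\dots,n}\ \max_{i=1,\dots,n_{k}} t_{i}^{k}.$$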

Then, we define $T$ as a discretization of the interval $[t_{\text{min}}, t_{\text{max}}]$ with a step size $\delta$, which can be chosen arbitrarily:

$$T = \{t_{\text{min}},\ t_{\text{min}} + \delta,\ \dots,\ t_{\text{max}}\}.$$
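The time axis can be sketched as below, assuming each flow is a list of `(timestamp, length)` pairs. The function name `time_axis` and the small floating-point tolerance are choices made for this example. If the span between the earliest and latest timestamp is not a multiple of the step size, the last grid point falls just below the latest timestamp.

```python
import math
from typing import List, Tuple

Flow = List[Tuple[float, int]]  # [(timestamp, length), ...]


def time_axis(flows: List[Flow], delta: float) -> List[float]:
    """Return the grid [t_min, t_min + delta, ...] covering all packets."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    timestamps = [t for flow in flows for t, _ in flow]
    t_min, t_max = min(timestamps), max(timestamps)
    # Small tolerance so float rounding does not drop the last grid point.
    n_steps = math.floor((t_max - t_min) / delta + 1e-9)
    return [t_min + i * delta for i in range(n_steps + 1)]
```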

For each flow $F_{k}$, we compute its amplitude $A_{k}$, which represents the total data volume transmitted by the flow:

$$A_{k} = \sum_{j=1}^{n_{k}} l_{j}^{k}.$$
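In code, the amplitude is simply the sum of packet lengths in a flow. The helper name `flow_amplitude` is an assumption for this example, and a flow is again taken to be a list of `(timestamp, length)` pairs.

```python
from typing import List, Tuple

Flow = List[Tuple[float, int]]  # [(timestamp, length), ...]


def flow_amplitude(flow: Flow) -> int:
    """Total number of bytes carried by the flow within the chunk."""
    return sum(length for _, length in flow)
```

A flow with three packets of 100, 200, and 50 bytes therefore has an amplitude of 350.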

Each packet is then mapped onto the time axis, and the value of the signal at each timestamp is computed by summing the contributions of all active flows:

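$$S(t_{\text{min}} + \delta t) = \sum_{k=1}^{n} A_{k} \sum_{j=1}^{n_{k}} l_{j}^{k}\, \mathbf{1}_{[t_{\text{min}} + \delta t,\ t_{\text{min}} + \delta (t+1))}(t_{j}^{k}), \qquad t = 0, \dots, \frac{t_{\text{max}} - t_{\text{min}}}{\delta},$$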

where, given a set $B$,

$$\mathbf{1}_{B}(x) = \begin{cases} 1 & \text{if } x \in B, \\ 0 & \text{otherwise.} \end{cases}$$

By scaling the signal values with amplitude, we ensure that each flow's influence is proportional to its total traffic volume within the chunk. High-volume flows contribute more significantly, while low-volume flows have a lesser impact.
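Putting the steps together, the signal of a chunk can be sketched as follows: every packet adds its length, multiplied by the amplitude of its flow, to the time bin of width δ that contains its timestamp. The function name `build_signal`, the list-of-pairs flow representation, and the floating-point tolerance are assumptions of this example rather than details taken from the text.

```python
import math
from typing import List, Tuple

Flow = List[Tuple[float, int]]  # [(timestamp, length), ...]


def build_signal(flows: List[Flow], delta: float) -> List[float]:
    """Aggregate all flows of a chunk into one amplitude-weighted signal."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    timestamps = [t for flow in flows for t, _ in flow]
    t_min, t_max = min(timestamps), max(timestamps)
    n_bins = math.floor((t_max - t_min) / delta + 1e-9) + 1
    signal = [0.0] * n_bins
    for flow in flows:
        amplitude = sum(length for _, length in flow)  # A_k
        for t, length in flow:
            # Bin i covers [t_min + i*delta, t_min + (i+1)*delta).
            idx = min(int((t - t_min) / delta + 1e-9), n_bins - 1)
            signal[idx] += amplitude * length
    return signal
```

As a worked case, take two flows with δ = 1: the first has packets of 100 and 200 bytes at seconds 0 and 1 (amplitude 300), the second has two 50-byte packets at seconds 1 and 2 (amplitude 100). The three bins then hold 300·100 = 30000, 300·200 + 100·50 = 65000, and 100·50 = 5000, so the high-volume flow dominates the bins in which it is active.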

Bibliography

[1] Ali, S., Abusabha, O., Ali, F., Imran, M., Abuhmed, T.: Effective multitask deep learning for iot malware detection and identification using behavioral traffic analysis. IEEE TNSM (2023). https://doi.org/10.1109/TNSM.2022.3200741
[2] Biondi, P.: Scapy. [online] Available: http://www.secdev.org/projects/scapy
[3] Chen, W., Lyu, F., Wu, F., Yang, P., Xue, G., Li, M.: Sequential message characterization for early classification of encrypted internet traffic. IEEE TVT (2021)
[4] Cui, W., Chen, T., Chan-Tin, E.: More realistic website fingerprinting using deep learning. In: ICDCS (2020). https://doi.org/10.1109/ICDCS47774.2020.00058
[5] Feng, J., Shen, L., Chen, Z., Wang, Y., Li, H.: A two-layer deep learning method for android malware detection using network traffic. IEEE Access (2020). https://doi.org/10.1109/ACCESS.2020.3008081
[6] Garg, S., Peddoju, S.K., Sarje, A.K.: Network-based detection of android malicious apps (2017). https://doi.org/10.1007/s10207-016-0343-z
[7] Habibi Lashkari, A., Draper Gil, G., Mamun, M., Ghorbani, A.: Characterization of encrypted and vpn traffic using time-related features (2016). https://doi.org/10.5220/0005740704070414
[8] Lashkari, A.H., A. Kadir, A.F., Gonzalez, H., Mbah, K.F., A. Ghorbani, A.: Towards a network-based framework for android malware detection and characterization. In: PST (2017). https://doi.org/10.1109/PST.2017.00035