TL;DR
This paper introduces an Architectural Selection Framework that empirically evaluates the fidelity-utility trade-off of synthetic network traffic across generative models and data structure types, giving security practitioners a basis for choosing a generator that fits their data.
Contribution
It systematically assesses twelve generative architectures (both non-AI and AI) on two datasets using three structural fidelity metrics, exposing their strengths and failure modes, and provides practical guidance for deploying synthetic data at scale.
Findings
GAN-based models like CTGAN and CopulaGAN show superior robustness.
Structural fidelity is crucial for downstream utility.
Diffusion models face computational barriers to large-scale use.
Abstract
The fidelity and utility of synthetic network traffic are critically compromised by architectural mismatch across heterogeneous network datasets and prevalent scalability failure. This study addresses this challenge by establishing an Architectural Selection Framework that empirically quantifies how data structure compatibility dictates the optimal fidelity-utility trade-off. We systematically evaluate twelve generative architectures (both non-AI and AI) across two distinct data structure types: categorical-heavy NSL-KDD and continuous-flow-heavy CIC-IDS2017. Fidelity is rigorously assessed through three structural metrics (Data Structure, Correlation, and Probability Distribution Difference) to confirm structural realism before evaluating downstream utility. Our results, confirmed over twenty independent runs (N=20), demonstrate that GAN-based models (CTGAN, CopulaGAN) exhibit superior…
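The abstract names three structural fidelity metrics but this summary does not give their definitions. The following is a minimal sketch of how such checks are commonly computed, not the paper's implementation: it assumes total variation distance for categorical columns, the two-sample Kolmogorov-Smirnov statistic for continuous columns, and the mean absolute gap in pairwise Pearson correlation. The function names and the toy values are hypothetical.

```python
from collections import Counter
from math import sqrt


def total_variation(real, synth):
    """Total variation distance between two categorical samples.

    0.0 means identical category frequencies, 1.0 means disjoint support.
    Assumed stand-in for a distribution-difference metric on categorical data.
    """
    p, q = Counter(real), Counter(synth)
    return 0.5 * sum(
        abs(p[k] / len(real) - q[k] / len(synth)) for k in set(p) | set(q)
    )


def ks_statistic(real, synth):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(real), sorted(synth)
    i = j = 0
    gap = 0.0
    for x in sorted(set(a) | set(b)):
        # Advance each pointer past all values <= x to get the CDF at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (non-constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    var_x = sum((u - mx) ** 2 for u in x)
    var_y = sum((v - my) ** 2 for v in y)
    return cov / sqrt(var_x * var_y)


def correlation_gap(real_cols, synth_cols):
    """Mean absolute difference in pairwise Pearson correlation.

    Both arguments map column name -> list of numeric values and must share
    the same column names. 0.0 means the correlation structure is preserved.
    """
    names = sorted(real_cols)
    diffs = [
        abs(pearson(real_cols[u], real_cols[v]) - pearson(synth_cols[u], synth_cols[v]))
        for idx, u in enumerate(names)
        for v in names[idx + 1:]
    ]
    return sum(diffs) / len(diffs)


# Toy values only; these are not drawn from NSL-KDD or CIC-IDS2017.
real_protocol = ["tcp", "tcp", "udp", "icmp"]
synth_protocol = ["tcp", "udp", "udp", "udp"]
tv = total_variation(real_protocol, synth_protocol)

ks_same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
ks_disjoint = ks_statistic([1, 2], [3, 4])

real_flows = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8]}
synth_flows = {"a": [1, 2, 3, 4], "b": [8, 6, 4, 2]}
corr_gap = correlation_gap(real_flows, synth_flows)
```

In this sketch, lower scores mean the synthetic sample is structurally closer to the real one. A categorical-heavy table would lean mostly on the first check and a continuous-flow-heavy table on the other two, which is one way to read the abstract's point about data structure compatibility.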
