Synthesizing Diverse Network Flow Datasets with Scalable Dynamic Multigraph Generation
Arya Grayeli, Vipin Swarup, Steven E. Noel

TL;DR
This paper presents a scalable machine learning approach for generating high-fidelity synthetic network flow datasets using dynamic multigraphs, improving accuracy and diversity over previous methods.
Contribution
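Introduces a scalable model that combines a stochastic Kronecker graph generator (structure), a tabular GAN (features), and XGBoost (graph alignment) to produce realistic synthetic network flow datasets, along with new metrics for evaluating the accuracy and diversity of the generated graphs.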
Findings
Outperforms previous graph generation methods in accuracy
Maintains high diversity in synthetic datasets
Provides new metrics for evaluating graph generative models
Abstract
Obtaining real-world network datasets is often challenging because of privacy, security, and computational constraints. In the absence of such datasets, graph generative models become essential tools for creating synthetic datasets. In this paper, we introduce a novel machine learning model for generating high-fidelity synthetic network flow datasets that are representative of real-world networks. Our approach involves the generation of dynamic multigraphs using a stochastic Kronecker graph generator for structure generation and a tabular generative adversarial network for feature generation. We further employ an XGBoost (eXtreme Gradient Boosting) model for graph alignment, ensuring accurate overlay of features onto the generated graph structure. We evaluate our model using new metrics that assess both the accuracy and diversity of the synthetic graphs. Our results demonstrate…
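To make the structure-generation step concrete, here is a minimal sketch of stochastic Kronecker edge sampling. It is not the authors' implementation. It assumes an R-MAT-style recursive placement, in which each edge picks one cell of the initiator matrix per level, and the function name `sample_kronecker_multigraph` and the initiator values are hypothetical. Because edges are drawn independently, repeated (source, destination) pairs occur naturally, which gives a multigraph. Timestamps, flow features, and the alignment step are outside the scope of this sketch.

```python
import numpy as np

def sample_kronecker_multigraph(initiator, k, num_edges, seed=0):
    """Sample multigraph edges from the k-th Kronecker power of `initiator`.

    Illustrative sketch only: each edge descends k levels, choosing one
    initiator cell per level with probability proportional to its entry.
    Returns an array of shape (num_edges, 2) with (src, dst) node ids.
    """
    rng = np.random.default_rng(seed)
    initiator = np.asarray(initiator, dtype=float)
    n0 = initiator.shape[0]

    # Normalise the initiator into a distribution over its n0 * n0 cells.
    probs = (initiator / initiator.sum()).ravel()

    # One cell choice per edge and per level.
    cells = rng.choice(n0 * n0, size=(num_edges, k), p=probs)
    rows, cols = np.divmod(cells, n0)

    # Interpret the per-level choices as base-n0 digits of the node ids.
    weights = n0 ** np.arange(k - 1, -1, -1)
    src = rows @ weights
    dst = cols @ weights
    return np.stack([src, dst], axis=1)

# Hypothetical 2x2 initiator; k=4 gives a graph on 2**4 = 16 nodes.
edges = sample_kronecker_multigraph([[0.9, 0.5], [0.5, 0.1]], k=4, num_edges=1000)
```

Sampling edges one at a time in this way avoids materialising the full edge-probability matrix, so the cost grows with the number of edges rather than with the square of the number of nodes.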
Taxonomy
Topics: Advanced Graph Neural Networks · Generative Adversarial Networks and Image Synthesis · Graph Theory and Algorithms
