TL;DR
This paper investigates using GANs to generate synthetic networked time series data, addressing challenges in fidelity and privacy, and introduces a custom workflow called DoppelGANger that improves data fidelity.
Contribution
The paper presents DoppelGANger (DG), a custom GAN workflow tailored to time series data with metadata, and demonstrates fidelity improvements over baseline models across diverse real-world datasets.
Findings
DoppelGANger achieves up to 43% better fidelity than baseline models.
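Existing GAN approaches struggle with fidelity (long-term dependencies, complex multidimensional relationships, mode collapse) and with privacy (existing guarantees are poorly understood and can sacrifice fidelity).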
Fundamental privacy challenges remain unresolved, with a suggested roadmap for future work.
Abstract
Limited data access is a longstanding barrier to data-driven research and development in the networked systems community. In this work, we explore if and how generative adversarial networks (GANs) can be used to incentivize data sharing by enabling a generic framework for sharing synthetic datasets with minimal expert knowledge. As a specific target, our focus in this paper is on time series datasets with metadata (e.g., packet loss rate measurements with corresponding ISPs). We identify key challenges of existing GAN approaches for such workloads with respect to fidelity (e.g., long-term dependencies, complex multidimensional relationships, mode collapse) and privacy (i.e., existing guarantees are poorly understood and can sacrifice fidelity). To improve fidelity, we design a custom workflow called DoppelGANger (DG) and demonstrate that across diverse real-world datasets (e.g.,…
