HydraGAN: A Multi-head, Multi-objective Approach to Synthetic Data Generation

Chance N DeSmet; Diane J Cook

arXiv:2111.07015 · cs.LG · November 16, 2021

TL;DR

HydraGAN is a multi-agent, multi-objective generative adversarial network designed to produce synthetic data that balances realism, privacy, and accuracy, outperforming baseline methods across multiple datasets.

Contribution

Introduces HydraGAN, a novel multi-agent, multi-objective GAN framework that optimizes multiple criteria simultaneously for synthetic data generation.

Findings

01 Outperforms baseline methods on three datasets.

02 Balances data realism, model accuracy, and privacy.

03 Provides equilibrium guarantees through game-theoretic principles.

Abstract

Synthetic data generation overcomes limitations of real-world machine learning. Traditional methods are valuable for augmenting costly datasets but only optimize one criterion: realism. In this paper, we tackle the problem of generating synthetic data that optimize multiple criteria. This goal is necessary when real data are replaced by synthetic for privacy preservation. We introduce HydraGAN, a new approach to synthetic data generation that introduces multiple generator and discriminator agents into the system. The multi-agent GAN optimizes the goal of privacy-preservation as well as data realism. To facilitate multi-agent training, we adapt game-theoretic principles to offer equilibrium guarantees. We observe that HydraGAN outperforms baseline methods for three datasets for multiple criteria of maximizing data realism, maximizing model accuracy, and minimizing re-identification risk.
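
To make the multi-objective idea in the abstract concrete, the following is a minimal sketch of how a generator's loss might combine feedback from several discriminator heads. It is not the authors' implementation: the head names (`realism`, `reidentification`), the target labels, and the weights are all hypothetical, and the weighted sum is a simple scalarization used here for illustration in place of the game-theoretic training procedure the paper describes.

```python
import math


def binary_cross_entropy(scores, target):
    """Mean binary cross-entropy between scores in (0, 1) and one constant target label."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for s in scores:
        s = min(max(s, eps), 1.0 - eps)
        total -= target * math.log(s) + (1.0 - target) * math.log(1.0 - s)
    return total / len(scores)


def multi_head_generator_loss(head_scores, desired_labels, weights):
    """Weighted sum of per-head losses (hypothetical scalarization, not the paper's method).

    head_scores:    head name -> scores that head assigned to a synthetic batch
    desired_labels: head name -> label the generator wants that head to output
    weights:        head name -> relative importance of that objective
    """
    total = 0.0
    for name, scores in head_scores.items():
        total += weights[name] * binary_cross_entropy(scores, desired_labels[name])
    return total


# Hypothetical scores from two discriminator heads on three synthetic samples.
head_scores = {
    "realism": [0.9, 0.8, 0.7],           # probability that the sample looks real
    "reidentification": [0.2, 0.1, 0.3],  # probability that a person can be re-identified
}
# The generator wants to look real (1.0) and to defeat re-identification (0.0).
desired_labels = {"realism": 1.0, "reidentification": 0.0}
weights = {"realism": 1.0, "reidentification": 0.5}

loss = multi_head_generator_loss(head_scores, desired_labels, weights)
print(f"combined generator loss: {loss:.4f}")
```

Changing the weights shifts the trade-off between objectives: a larger `reidentification` weight penalizes samples that the privacy head can link back to individuals more heavily than samples that merely look unrealistic.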

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Machine Learning and Data Classification · Mobile Crowdsensing and Crowdsourcing