Private data sharing between decentralized users through the privGAN architecture
Jean-Francois Rajotte, Raymond T Ng

TL;DR
This paper introduces a privacy-preserving data sharing method based on the privGAN architecture. Data owners generate synthetic data without revealing their actual data or the parameters of models that have direct access to it, which makes the method suitable for federated learning settings.
Contribution
The paper proposes a novel privGAN-based approach for decentralized data sharing that maintains privacy while improving data utility for machine learning tasks.
Findings
Synthetic data gives owners better utility than their own small real datasets.
Shared discriminator updates preserve privacy against white-box attacks.
Method is applicable in federated learning scenarios.
Abstract
More data is almost always beneficial for analysis and machine learning tasks. In many realistic situations, however, an enterprise cannot share its data, either to keep a competitive advantage or to protect the privacy of the data sources, the enterprise's clients for example. We propose a method for data owners to share synthetic or fake versions of their data without sharing either the actual data or the parameters of models that have direct access to the data. The proposed method is based on the privGAN architecture, where local GANs are trained on their respective data subsets with an extra penalty from a central discriminator that aims to identify the origin of a given fake sample. We demonstrate that this approach, when applied to subsets of various sizes, leads to better utility for the owners than the utility from their real small datasets. The only shared pieces of information are…
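The penalty described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the function name `generator_loss`, the weight `lam`, the non-saturating form of the local adversarial term, and the exact form of the penalty are all assumptions chosen for illustration. The sketch only shows the general idea that a local generator is penalized when the central discriminator can tell which owner a fake sample came from.

```python
import numpy as np

def generator_loss(d_local_fake, d_central_probs, owner, lam=1.0, eps=1e-12):
    """Hypothetical per-owner generator loss (illustrative assumption, not the paper's code).

    d_local_fake:    shape (n,), the local discriminator's probability that
                     each fake sample is real.
    d_central_probs: shape (n, k), the central discriminator's predicted
                     probabilities over the k owners for each fake sample.
    owner:           index of the owner whose generator produced the samples.
    lam:             assumed weight of the central-discriminator penalty.
    """
    d_local_fake = np.asarray(d_local_fake, dtype=float)
    d_central_probs = np.asarray(d_central_probs, dtype=float)

    # Local adversarial term (non-saturating form, assumed here):
    # small when the local discriminator believes the fakes are real.
    adversarial = -np.mean(np.log(d_local_fake + eps))

    # Penalty term: mean log-probability that the central discriminator
    # assigns to the true origin. It is larger (closer to 0) when the
    # origin of the fake samples is easy to identify.
    penalty = np.mean(np.log(d_central_probs[:, owner] + eps))

    return adversarial + lam * penalty

# Two fake samples from owner 0, with two owners in total.
local = np.array([0.5, 0.5])
identifiable = np.array([[0.9, 0.1], [0.9, 0.1]])  # origin easy to guess
ambiguous = np.array([[0.5, 0.5], [0.5, 0.5]])     # origin hidden

loss_identifiable = generator_loss(local, identifiable, owner=0)
loss_ambiguous = generator_loss(local, ambiguous, owner=0)
```

Under these assumptions, `loss_identifiable` is larger than `loss_ambiguous`, so minimizing the loss pushes the generator toward samples whose origin the central discriminator cannot recover.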
