Generating Artificial Data for Private Deep Learning
Aleksei Triastcyn, Boi Faltings

TL;DR
This paper introduces a method that uses generative adversarial networks to create artificial data that retains the statistical properties of real data while protecting the privacy of the original dataset, enabling safe model training.
Contribution
It presents a novel approach combining GANs with an empirical privacy risk assessment to generate high-quality, privacy-preserving artificial data for deep learning.
Findings
Artificial data retains key statistical properties.
Models trained on artificial data perform well.
Privacy risk is effectively limited.
Abstract
In this paper, we propose generating artificial data that retain statistical properties of real data as a means of providing privacy with respect to the original dataset. We use a generative adversarial network to draw privacy-preserving artificial data samples and derive an empirical method to assess the risk of information disclosure in a differential-privacy-like way. Our experiments show that we are able to generate artificial data of high quality and successfully train and validate machine learning models on this data while limiting potential privacy loss.
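To make "a differential-privacy-like way" more concrete: differential privacy bounds how much the probability of any output can change, in log-ratio terms, when the input dataset changes by one record. The sketch below is only an illustration of that idea and is not the estimator derived in the paper. It assumes a hypothetical helper, `empirical_epsilon`, that compares two sets of one-dimensional outputs through smoothed histograms and reports the largest absolute log-ratio between bins. The bin count and smoothing constant are arbitrary choices made for the example.

```python
import math
import random


def empirical_epsilon(samples_a, samples_b, bins=20, smoothing=1.0):
    """Largest absolute log-ratio between two smoothed histograms.

    Hypothetical helper for illustration only; not the paper's method.
    """
    lo = min(min(samples_a), min(samples_b))
    hi = max(max(samples_a), max(samples_b))
    # Fall back to a unit width if every value is identical.
    width = (hi - lo) / bins or 1.0

    def smoothed_hist(samples):
        counts = [0] * bins
        for x in samples:
            # Clamp the maximum value into the last bin.
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # Additive smoothing keeps every bin probability positive.
        total = len(samples) + smoothing * bins
        return [(c + smoothing) / total for c in counts]

    p = smoothed_hist(samples_a)
    q = smoothed_hist(samples_b)
    return max(abs(math.log(pi / qi)) for pi, qi in zip(p, q))


# Synthetic stand-ins for outputs produced from two neighboring datasets.
rng = random.Random(0)
baseline = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same_dist = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(5000)]

eps_same = empirical_epsilon(baseline, same_dist)
eps_shifted = empirical_epsilon(baseline, shifted)
print(eps_same, eps_shifted)
```

A small value suggests the two output distributions are hard to tell apart, while a large value suggests the change in input is easy to detect from the outputs. Unlike a formal differential privacy guarantee, an estimate of this kind depends on the samples drawn and on the binning, so it is an empirical indicator rather than a worst-case bound.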
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques
