Generating Artificial Data for Private Deep Learning

Aleksei Triastcyn; Boi Faltings

arXiv:1803.03148 · cs.LG · April 30, 2019 · 23 cites

Open Access

TL;DR

This paper uses generative adversarial networks to create artificial data that retains the statistical properties of the real data, so that models can be trained on it while limiting privacy loss with respect to the original dataset.

Contribution

The paper combines GANs with an empirical privacy risk assessment to generate high-quality, privacy-preserving artificial data for deep learning.

Findings

01 Artificial data retains key statistical properties.

02 Models trained on artificial data perform well.

03 Privacy risk is effectively limited.
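
The second finding refers to an evaluation pattern that can be sketched as: fit a generator on real training data, train a model only on generated samples, then score that model on held-out real data. The sketch below is a toy illustration under loud assumptions. The per-class Gaussian "generator" is a stand-in and is not a GAN, it carries no privacy guarantee, and every function name and the one-dimensional two-class dataset are hypothetical rather than taken from the paper.

```python
import random
import statistics


def make_real_data(n_per_class, rng):
    """Toy 'real' dataset: two 1-D Gaussian classes (hypothetical stand-in)."""
    data = []
    for label, mean in ((0, -2.0), (1, 2.0)):
        data += [(rng.gauss(mean, 1.0), label) for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


def fit_generator(data):
    """Stand-in generator: per-class mean and standard deviation. NOT a GAN."""
    params = {}
    for label in sorted({y for _, y in data}):
        xs = [x for x, y in data if y == label]
        params[label] = (statistics.mean(xs), statistics.stdev(xs))
    return params


def sample_artificial(params, n_per_class, rng):
    """Draw fresh labelled samples from the fitted per-class Gaussians."""
    return [(rng.gauss(mu, sigma), label)
            for label, (mu, sigma) in params.items()
            for _ in range(n_per_class)]


def train_centroids(data):
    """Nearest-centroid classifier: store one mean per class."""
    return {label: statistics.mean([x for x, y in data if y == label])
            for label in sorted({y for _, y in data})}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, data):
    """Fraction of (x, y) pairs classified correctly."""
    return sum(predict(centroids, x) == y for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
real_train = make_real_data(500, rng)
real_test = make_real_data(500, rng)

# The classifier below never sees real_train directly, only generated samples.
artificial = sample_artificial(fit_generator(real_train), 500, rng)

acc_real = accuracy(train_centroids(real_train), real_test)
acc_artificial = accuracy(train_centroids(artificial), real_test)
print(f"trained on real data:       {acc_real:.3f}")
print(f"trained on artificial data: {acc_artificial:.3f}")
```

Comparing the two accuracies on the same real test set is what makes the claim checkable: a small gap suggests the generated data kept the structure the classifier needs. It says nothing about privacy, which has to be assessed separately.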

Abstract

In this paper, we propose generating artificial data that retain statistical properties of real data as the means of providing privacy with respect to the original dataset. We use a generative adversarial network to draw privacy-preserving artificial data samples and derive an empirical method to assess the risk of information disclosure in a differential-privacy-like way. Our experiments show that we are able to generate artificial data of high quality and successfully train and validate machine learning models on this data while limiting potential privacy loss.
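
For readers unfamiliar with the term, the "differential-privacy-like" assessment mentioned in the abstract alludes to the standard notion of (ε, δ)-differential privacy, stated below for reference. This is the textbook definition only. The paper's empirical measure is not spelled out on this page, so the formula should not be read as the authors' exact criterion.

```latex
% A randomized mechanism M is (\varepsilon, \delta)-differentially private if,
% for all datasets D and D' differing in a single record and for every
% measurable set of outputs S,
\[
  \Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,] + \delta .
\]
```

Smaller values of ε and δ mean the output distribution changes less when any one record is added or removed, which bounds what the output can disclose about that record.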

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques