Privacy-preserving Machine Learning through Data Obfuscation

Tianwei Zhang; Zecheng He; Ruby B. Lee

arXiv:1807.01860 · cs.CR · July 16, 2018 · 66 cites

TL;DR

This paper introduces a data obfuscation method that adds noise to training data, enabling privacy-preserving machine learning without significantly sacrificing model accuracy, and effectively defending against privacy attacks.

Contribution

A novel, generic data obfuscation technique that protects training data privacy while maintaining high model accuracy in machine learning applications.

Findings

01 Effective against four types of privacy attacks

02 Negligible impact on model accuracy

03 Applicable to various datasets and models

Abstract

As machine learning becomes a common practice and a commodity, numerous cloud-based services and frameworks are provided to help customers develop and deploy machine learning applications. While it is prevalent to outsource model training and serving tasks to the cloud, it is important to protect the privacy of sensitive samples in the training dataset and prevent information leakage to untrusted third parties. Past work has shown that a malicious machine learning service provider or end user can easily extract critical information about the training samples from the model parameters or even just the model outputs. In this paper, we propose a novel and generic methodology to preserve the privacy of training data in machine learning applications. Specifically, we introduce an obfuscate function and apply it to the training data before feeding it to the model training task. This function adds…
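The idea of obfuscating training data before it reaches the training task can be illustrated with a minimal sketch. The abstract is truncated here and does not say what the obfuscate function adds or how, so everything below is an assumption made for illustration only: the Gaussian noise, the `noise_scale` and `seed` parameters, and the list-of-lists data layout are hypothetical and are not taken from the paper.

```python
import random


def obfuscate(samples, noise_scale=0.1, seed=None):
    """Return a noisy copy of `samples`, a list of numeric feature lists.

    Assumption: zero-mean Gaussian noise is added independently to each
    feature. The paper's actual obfuscate function may differ.
    """
    rng = random.Random(seed)
    return [
        [value + rng.gauss(0.0, noise_scale) for value in row]
        for row in samples
    ]


# Hypothetical toy training set: two samples with two features each.
train_x = [[0.0, 1.0], [2.0, 3.0]]

# Obfuscate locally; only the noisy copy would be handed to the
# (possibly untrusted) training service.
noisy_x = obfuscate(train_x, noise_scale=0.1, seed=0)
```

In this sketch the original `train_x` is left untouched and never leaves the data owner. Larger values of `noise_scale` hide more about each sample but would be expected to cost more model accuracy.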

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications