Privacy-preserving Machine Learning through Data Obfuscation
Tianwei Zhang, Zecheng He, Ruby B. Lee

TL;DR
This paper introduces a data obfuscation method that adds noise to training data before model training, enabling privacy-preserving machine learning without significantly sacrificing model accuracy while defending against four types of privacy attacks.
Contribution
A novel, generic data obfuscation technique that protects training data privacy while maintaining high model accuracy in machine learning applications.
Findings
Effective against four types of privacy attacks
Negligible impact on model accuracy
Applicable to various datasets and models
Abstract
As machine learning becomes a practice and commodity, numerous cloud-based services and frameworks are provided to help customers develop and deploy machine learning applications. While it is prevalent to outsource model training and serving tasks to the cloud, it is important to protect the privacy of sensitive samples in the training dataset and prevent information leakage to untrusted third parties. Past work has shown that a malicious machine learning service provider or end user can easily extract critical information about the training samples from the model parameters or even just the model outputs. In this paper, we propose a novel and generic methodology to preserve the privacy of training data in machine learning applications. Specifically, we introduce an obfuscate function and apply it to the training data before feeding it to the model training task. This function adds…
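As a rough illustration of the idea in the abstract (obfuscate the training data first, then hand only the obfuscated copy to the training task), here is a minimal sketch. It is not the authors' obfuscate function: the abstract is truncated before it describes what the function adds, so the additive Gaussian noise, the name `obfuscate`, and the parameters `noise_std` and `seed` are all assumptions made for this example.

```python
import random
from typing import List, Sequence


def obfuscate(
    samples: Sequence[Sequence[float]],
    noise_std: float = 0.1,  # hypothetical knob: larger values hide more, but may cost accuracy
    seed: int = 0,           # hypothetical: fixed seed so the sketch is reproducible
) -> List[List[float]]:
    """Return a noisy copy of `samples`, leaving the originals untouched.

    Assumption: independent zero-mean Gaussian noise is added to every feature.
    The paper's actual function may differ.
    """
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, noise_std) for x in row] for row in samples]


# Toy training set: three samples with two features each.
train_x = [[0.2, 0.7], [0.9, 0.1], [0.5, 0.5]]

# Only the obfuscated copy would be sent to the (possibly untrusted) training service.
noisy_x = obfuscate(train_x, noise_std=0.05)
```

In a sketch like this, `noise_std` is where the trade-off between privacy and accuracy would be tuned: with `noise_std=0` the data is returned unchanged, and larger values move each sample further from its original.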
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
