Desensitized RDCA Subspaces for Compressive Privacy in Machine Learning
Artur Filipowicz, Thee Chanyaswad, S. Y. Kung

TL;DR
This paper proposes a privacy-preserving method using RDCA to desensitize data, effectively reducing privacy risks with minimal impact on utility across multiple datasets.
Contribution
It introduces a novel application of RDCA for data desensitization in machine learning, demonstrating effective privacy protection with low utility loss.
Findings
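Privacy accuracy drops to random-guess levels on HAR and CMU Faces, and to almost zero for the privacy-sensitive Semeion digits
Utility accuracy drops by 5.14% (HAR), 0.04% (CMU Faces), and 7.53% (utility-relevant Semeion digits)
The method is effective on every dataset reported in the abstract: HAR, CMU Faces, and Semeion Handwritten Digit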
Abstract
The quest for better data analysis and artificial intelligence has led to more and more data being collected and stored. As a consequence, more data are exposed to malicious entities. This paper examines the problem of privacy in machine learning for classification. We use Ridge Discriminant Component Analysis (RDCA) to desensitize data with respect to a privacy label. Based on five experiments, we show that desensitization by RDCA can effectively protect privacy (i.e., yield low accuracy on the privacy label) with a small loss in utility. On the HAR and CMU Faces datasets, the use of desensitized data results in random-guess-level accuracies for privacy, at an average cost of 5.14% and 0.04%, respectively, in utility accuracy. For the Semeion Handwritten Digit dataset, accuracies on the privacy-sensitive digits are almost zero, while the accuracies for the utility-relevant digits drop by 7.53%…
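To make the idea of desensitizing data with respect to a privacy label more concrete, here is a minimal NumPy sketch. It is not the authors' code. It assumes the common formulation of RDCA as the generalized eigenproblem S_B v = λ (S̄ + ρI) v, where S_B is the between-class scatter of the privacy label and S̄ is the total scatter, and it assumes that desensitization means keeping only the eigenvectors outside the top L − 1 (the "noise" subspace for a privacy label with L classes). The function names `rdca_noise_basis` and `desensitize`, the ridge value `rho`, the final QR orthonormalization, and the synthetic data are all illustrative choices.

```python
import numpy as np


def rdca_noise_basis(X, y_priv, rho=1e-3):
    """Orthonormal basis of the assumed RDCA noise subspace for a privacy label.

    Hypothetical helper: solves S_b v = lambda (S_bar + rho I) v and keeps the
    eigenvectors that carry no between-class (privacy) scatter.
    """
    X = np.asarray(X, dtype=float)
    y_priv = np.asarray(y_priv)
    mean = X.mean(axis=0)
    Xc = X - mean
    d = X.shape[1]
    classes = np.unique(y_priv)

    # Total scatter and between-class scatter for the privacy label.
    S_bar = Xc.T @ Xc
    S_b = np.zeros((d, d))
    for c in classes:
        members = X[y_priv == c]
        diff = members.mean(axis=0) - mean
        S_b += len(members) * np.outer(diff, diff)

    # Reduce the generalized problem to a symmetric one via Cholesky whitening.
    L = np.linalg.cholesky(S_bar + rho * np.eye(d))
    L_inv = np.linalg.inv(L)
    A = L_inv @ S_b @ L_inv.T
    A = (A + A.T) / 2  # symmetrize against round-off
    _, U = np.linalg.eigh(A)  # eigenvalues in ascending order
    V = L_inv.T @ U  # generalized eigenvectors

    # S_b has rank at most (number of classes - 1); the remaining
    # eigenvectors span the noise subspace.
    k = len(classes) - 1
    V_noise = V[:, : d - k]

    # Orthonormalize for convenience (an assumption, not part of RDCA itself).
    Q, _ = np.linalg.qr(V_noise)
    return Q, mean


def desensitize(X, Q, mean):
    """Project centered data onto the noise subspace."""
    return (np.asarray(X, dtype=float) - mean) @ Q


# Synthetic example: three privacy classes in six dimensions.
rng = np.random.default_rng(0)
centers = np.array(
    [[3, 0, 0, 0, 0, 0], [0, 3, 0, 0, 0, 0], [-3, -3, 0, 0, 0, 0]], dtype=float
)
y_priv = np.repeat(np.arange(3), 100)
X = centers[y_priv] + rng.normal(size=(300, 6))

Q, mean = rdca_noise_basis(X, y_priv)
Z = desensitize(X, Q, mean)  # desensitized features, shape (300, 4)
```

In this sketch the per-class means of `Z` coincide, so the differences between privacy-class means no longer separate the classes. That alone does not guarantee privacy against every classifier, and the utility cost depends on how much the utility label overlaps with the removed directions.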
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Internet Traffic Analysis and Secure E-voting
