Towards a Data Privacy-Predictive Performance Trade-off
Tânia Carvalho, Nuno Moniz, Pedro Faria, Luís Antunes

TL;DR
This paper investigates the trade-off between data privacy and predictive performance in machine learning classification tasks, demonstrating that increased privacy protection generally reduces model accuracy.
Contribution
It provides a comprehensive evaluation of various privacy-preserving techniques, confirming the existence of a clear trade-off between privacy levels and predictive performance.
Findings
Higher privacy levels lead to increased re-identification resistance.
Enhanced privacy measures negatively impact classification accuracy.
The trade-off is consistent across multiple privacy techniques and algorithms.
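As a rough illustration of the first finding, the sketch below generalizes one attribute and measures how the smallest group of indistinguishable records grows. It is not the authors' evaluation pipeline: the toy records, the attribute names (`age`, `zip`, `label`), the 10-year band width, and the helper functions are all hypothetical, and the smallest group size is used here only as a simple k-anonymity-style stand-in for re-identification resistance.

```python
from collections import Counter

def generalize_age(age, width):
    """Replace an exact age with the lower bound of its width-year band."""
    return (age // width) * width

def deidentify(records, width):
    """Return copies of the records with the age attribute generalized."""
    return [{**r, "age": generalize_age(r["age"], width)} for r in records]

def smallest_group(records, quasi_identifiers):
    """Size of the smallest set of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())

# Hypothetical toy data, invented for illustration only.
records = [
    {"age": 23, "zip": "4000", "label": 1},
    {"age": 27, "zip": "4000", "label": 0},
    {"age": 31, "zip": "4000", "label": 1},
    {"age": 38, "zip": "4000", "label": 0},
]

quasi_identifiers = ["age", "zip"]
generalized = deidentify(records, width=10)

k_raw = smallest_group(records, quasi_identifiers)
k_generalized = smallest_group(generalized, quasi_identifiers)

# Detail available to a classifier: number of distinct age values.
distinct_raw = len({r["age"] for r in records})
distinct_generalized = len({r["age"] for r in generalized})

print(k_raw, k_generalized)                # 1 2
print(distinct_raw, distinct_generalized)  # 4 2
```

In this toy case every raw record is unique on its quasi-identifiers, while after generalization each record shares its values with one other. The same step halves the number of distinct age values, which is the kind of lost detail that can lower classification accuracy.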
Abstract
Machine learning is increasingly used in the most diverse applications and domains, whether in healthcare, to predict pathologies, or in the financial sector to detect fraud. One of the linchpins for efficiency and accuracy in machine learning is data utility. However, when it contains personal information, full access may be restricted due to laws and regulations aiming to protect individuals' privacy. Therefore, data owners must ensure that any data shared guarantees such privacy. Removal or transformation of private information (de-identification) are among the most common techniques. Intuitively, one can anticipate that reducing detail or distorting information would result in losses for model predictive performance. However, previous work concerning classification tasks using de-identified data generally demonstrates that predictive performance can be preserved in specific…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning and Data Classification · Data Quality and Management
