On the Privacy Effect of Data Enhancement via the Lens of Memorization
Xiao Li, Qiongxiu Li, Zhanhao Hu, Xiaolin Hu

TL;DR
This paper investigates the privacy implications of data enhancement (data augmentation and adversarial training) in machine learning through the lens of memorization. It finds that privacy metrics based on commonly deployed membership inference attacks can be misleading, and that privacy, generalization, and robustness are related in more complex ways than previously assumed.
Contribution
The paper introduces a memorization-based perspective for evaluating privacy, challenges existing privacy metrics, and explores the relationships among privacy, generalization, and robustness.
Findings
A memorization-based attack captures individual privacy risks more accurately.
Privacy leakage is less correlated with the generalization gap than previously thought.
There is no inherent trade-off between adversarial robustness and privacy.
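For readers unfamiliar with the term, one common way to formalize the memorization of a single training example is the label-memorization score of Feldman (2020). Whether the paper uses exactly this definition is an assumption here, since this summary does not spell it out. The symbols A (training algorithm), S (training set), and h (trained model) are introduced only for illustration.

```latex
% Memorization of example i = (x_i, y_i) by algorithm A on training set S:
% how much more likely the model is to predict y_i when i is in the training set
% than when i is removed from it.
\[
\operatorname{mem}(A, S, i)
  = \Pr_{h \leftarrow A(S)}\bigl[h(x_i) = y_i\bigr]
  - \Pr_{h \leftarrow A(S \setminus \{i\})}\bigl[h(x_i) = y_i\bigr]
\]
```

Under this definition, a score near 1 means the model predicts the example correctly only when it has been trained on it, while a score near 0 means including the example makes little difference to the prediction.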
Abstract
Machine learning poses severe privacy concerns as it has been shown that the learned models can reveal sensitive information about their training data. Many works have investigated the effect of widely adopted data augmentation and adversarial training techniques, termed data enhancement in the paper, on the privacy leakage of machine learning models. Such privacy effects are often measured by membership inference attacks (MIAs), which aim to identify whether a particular example belongs to the training set or not. We propose to investigate privacy from a new perspective called memorization. Through the lens of memorization, we find that previously deployed MIAs produce misleading results as they are less likely to identify samples with higher privacy risks as members compared to samples with low privacy risks. To solve this problem, we deploy a recent attack that can capture individual…
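To make the membership inference setting concrete, the following is a minimal sketch of a simple loss-threshold attack, a common baseline that predicts "member" when a model's loss on an example is low. It is not the attack the paper deploys. The function names, the threshold, and the loss values are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def loss_threshold_mia(losses: Sequence[float], threshold: float) -> List[bool]:
    """Predict 'member' (True) when the per-example loss is below the threshold."""
    return [loss < threshold for loss in losses]


def attack_accuracy(predictions: Sequence[bool], is_member: Sequence[bool]) -> float:
    """Fraction of examples whose membership was guessed correctly."""
    correct = sum(p == m for p, m in zip(predictions, is_member))
    return correct / len(is_member)


# Hypothetical per-example losses from a trained model (made-up numbers).
member_losses = [0.02, 0.05, 0.10, 0.90]      # examples seen during training
non_member_losses = [0.40, 0.75, 1.20, 0.08]  # held-out examples

losses = member_losses + non_member_losses
labels = [True] * len(member_losses) + [False] * len(non_member_losses)

# The threshold is an assumption; in practice it would be calibrated,
# for example on shadow models or a held-out calibration set.
preds = loss_threshold_mia(losses, threshold=0.3)
acc = attack_accuracy(preds, labels)
print(f"attack accuracy: {acc:.2f}")
```

An aggregate score such as this accuracy averages over all examples, so it does not show which individual examples the attack gets right or wrong.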
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Privacy-Preserving Technologies in Data
