On the Privacy Effect of Data Enhancement via the Lens of Memorization

Xiao Li; Qiongxiu Li; Zhanhao Hu; Xiaolin Hu

arXiv:2208.08270 · cs.LG · March 26, 2024 · 1 citation

TL;DR

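This paper studies how data enhancement (data augmentation and adversarial training) affects the privacy leakage of machine learning models, using memorization as the lens. It finds that previously deployed membership inference attacks can give misleading results, that privacy leakage is less correlated with the generalization gap than previously thought, and that there is no inherent trade-off between adversarial robustness and privacy.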

Contribution

It introduces a memorization-based perspective for privacy evaluation, challenging existing metrics, and explores the relationships among privacy, generalization, and robustness.

Findings

01 Memorization-based attack captures individual privacy risks more accurately.

02 Privacy leakage is less correlated with generalization gap than previously thought.

03 No inherent trade-off between adversarial robustness and privacy.

Abstract

Machine learning poses severe privacy concerns as it has been shown that the learned models can reveal sensitive information about their training data. Many works have investigated the effect of widely adopted data augmentation and adversarial training techniques, termed data enhancement in the paper, on the privacy leakage of machine learning models. Such privacy effects are often measured by membership inference attacks (MIAs), which aim to identify whether a particular example belongs to the training set or not. We propose to investigate privacy from a new perspective called memorization. Through the lens of memorization, we find that previously deployed MIAs produce misleading results as they are less likely to identify samples with higher privacy risks as members compared to samples with low privacy risks. To solve this problem, we deploy a recent attack that can capture individual…
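To make the membership inference setting in the abstract concrete, here is a minimal sketch of a loss-threshold attack, one of the simplest forms of MIA. It is not the attack the paper deploys, and it is not taken from the paper's repository. The function names, the threshold, and all loss values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def loss_threshold_mia(losses: Sequence[float], threshold: float) -> List[bool]:
    """Guess 'member' (True) when the model's loss on an example is below the threshold.

    Assumption: training examples tend to have lower loss than unseen examples.
    """
    return [loss < threshold for loss in losses]


def attack_accuracy(predictions: Sequence[bool], is_member: Sequence[bool]) -> float:
    """Fraction of examples whose membership was guessed correctly."""
    correct = sum(p == m for p, m in zip(predictions, is_member))
    return correct / len(is_member)


# Hypothetical per-example losses (made-up numbers, not results from the paper).
member_losses = [0.02, 0.10, 0.05, 1.30]     # last one: a training example the model fits poorly
nonmember_losses = [0.90, 1.10, 0.08, 1.50]  # third one: an unseen example the model happens to fit well

losses = member_losses + nonmember_losses
labels = [True] * len(member_losses) + [False] * len(nonmember_losses)

predictions = loss_threshold_mia(losses, threshold=0.5)
accuracy = attack_accuracy(predictions, labels)
print(predictions, accuracy)
```

In this toy setup the attack is scored only by average accuracy over all examples, so it says nothing about which individual examples are most at risk. Evaluating risk per example is the gap the abstract's memorization perspective is aimed at.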

Code & Models

Repositories

lixiaothu/privacy_and_aug (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Privacy-Preserving Technologies in Data