Spurious Privacy Leakage in Neural Networks

Chenxiang Zhang; Jun Pang; Sjouke Mauw

arXiv:2505.20095·cs.LG·October 7, 2025

Open Access · 3 Reviews

TL;DR

This paper explores how spurious correlations in neural networks lead to privacy vulnerabilities, revealing that models often memorize spurious features and that robust methods do not mitigate privacy disparities across groups.

Contribution

It introduces the concept of spurious privacy leakage, analyzes why robust methods fail to reduce privacy disparity, and compares how different architectures influence privacy in the presence of spurious data.

Findings

01. Spurious groups are more vulnerable to privacy attacks than non-spurious groups.

02. Privacy disparity increases in tasks with fewer classes due to spurious features.

03. Robust methods do not prevent memorization of spurious features during training.

Abstract

Neural networks trained on real-world data often exhibit biases while simultaneously being vulnerable to privacy attacks aimed at extracting sensitive information. Despite extensive research on each problem individually, their intersection remains poorly understood. In this work, we investigate the privacy impact of spurious correlation bias. We introduce spurious privacy leakage, a phenomenon in which spurious groups are significantly more vulnerable to privacy attacks than non-spurious groups. We observe that privacy disparity between groups increases in tasks with simpler objectives (e.g., fewer classes) due to spurious features. Counterintuitively, we demonstrate that spurious robust methods, designed to reduce spurious bias, fail to mitigate privacy disparity. Our analysis reveals that this occurs because robust methods can reduce reliance on spurious features for prediction,…
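To make "privacy disparity between groups" concrete, here is a minimal sketch of one way such a gap could be measured: the true-positive rate of a threshold-based membership inference attack at a fixed false-positive rate, computed separately per group. This is an illustration under stated assumptions, not the authors' evaluation code. The function names, the per-group thresholding, the 0.25 FPR target, and all scores are hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """TPR of a threshold attack whose FPR on non-members is at most target_fpr.

    A sample is predicted "member" when its attack score is strictly above
    the threshold. Higher score means "more likely a training member".
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(nonmember_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))  # false positives we tolerate
    # At most `allowed` non-members score strictly above ranked[allowed].
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    hits = sum(1 for s in member_scores if s > threshold)
    return hits / len(member_scores)


def per_group_tpr(records, target_fpr):
    """records: iterable of (group, is_member, score) tuples.

    Assumption: the threshold is calibrated separately inside each group.
    """
    members, nonmembers = defaultdict(list), defaultdict(list)
    for group, is_member, score in records:
        (members if is_member else nonmembers)[group].append(score)
    return {
        g: tpr_at_fpr(members[g], nonmembers[g], target_fpr)
        for g in members
        if nonmembers[g]
    }


# Made-up attack scores, NOT results from the paper.
toy_records = [
    ("spurious", True, 0.9), ("spurious", True, 0.8),
    ("spurious", True, 0.7), ("spurious", True, 0.3),
    ("spurious", False, 0.5), ("spurious", False, 0.4),
    ("spurious", False, 0.2), ("spurious", False, 0.1),
    ("non-spurious", True, 0.6), ("non-spurious", True, 0.45),
    ("non-spurious", True, 0.3), ("non-spurious", True, 0.2),
    ("non-spurious", False, 0.5), ("non-spurious", False, 0.4),
    ("non-spurious", False, 0.35), ("non-spurious", False, 0.1),
]

group_tpr = per_group_tpr(toy_records, target_fpr=0.25)
disparity = group_tpr["spurious"] - group_tpr["non-spurious"]
print(group_tpr, disparity)
```

A positive `disparity` on real attack scores would mean the attack identifies members of the spurious group more reliably than members of the non-spurious group at the same false-positive budget.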

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 4

Strengths

1. The paper attempts to examine the relationship between bias and privacy leakage. The key findings are well articulated via experiments.
2. The paper is overall well written and easy to follow.

Weaknesses

1. The motivation is not very clear. What is the reason to assess privacy disparities between spurious and non-spurious subgroups?
2. The related work section is too brief considering the multiple topics the paper covers. Expanding it to include more of the relevant literature would strengthen the foundation. It is also unclear how the findings align with current research.
3. The experiments are not clearly explained, including the choice of datasets and neural network models.

Reviewer 02 · Rating 8 · Confidence 5

Strengths

1. The paper presents highly comprehensive experiments that are quite well done. The results are well-presented, and well-supported by experimental evidence.

Weaknesses

1. I would urge the authors to make some of the details a little bit more transparent in the main body. One difference between the standard setting and this work is that MIAs are used for fine-tuning and not pre-training data. This may mean that the fine-tuned datasets are very small per model. One of the best-kept secrets in LIRA-style membership inference is that the MIA is always carried out on models that are trained on only a subset of the data, and making that subset bigger leads to worse "pri

Reviewer 03 · Rating 5 · Confidence 4

Strengths

- The paper focuses on the connection between spurious correlations and privacy leakage, an underdeveloped topic in trustworthy machine learning.
- The paper presents several interesting observations.
- The paper is well-organized and easy to follow.

Weaknesses

1. The motivation behind evaluating privacy disparities among subgroups (spurious vs. non-spurious groups) is unclear. While the paper shows that existing methods (DRO, DFR, or DP-SGD) may not fully address these privacy gaps, it's unclear why privacy parity across subgroups could be a priority. Why should we care about these gaps?
2. While the authors present technically interesting observations about privacy disparities, the results are more like experimental reports. It’s unclear how these f

Taxonomy

Topics: Adversarial Robustness in Machine Learning