Provable Privacy Attacks on Trained Shallow Neural Networks

Guy Smorodinsky; Gal Vardi; Itay Safran

arXiv:2410.07632 · cs.LG · February 11, 2025

TL;DR

This paper demonstrates provable privacy vulnerabilities in trained 2-layer ReLU neural networks, including data reconstruction and membership inference, leveraging the implicit bias of these models.

Contribution

It introduces the first provable attacks on trained shallow neural networks based on their implicit bias, covering both data reconstruction and membership inference.
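
For context, the implicit bias in question is usually stated as a max-margin characterization: for homogeneous networks trained with gradient flow on an exponentially-tailed loss, the parameters are known to converge in direction to a KKT point of the problem below. That the paper uses exactly this form is an assumption; the symbols $\theta$, $N_\theta$, $(x_i, y_i)$, and $\lambda_i$ are notation chosen here for illustration.

```latex
% Max-margin problem over parameters \theta of the network N_\theta,
% with training set \{(x_i, y_i)\}_{i=1}^{n}, y_i \in \{-1, +1\}
\min_{\theta} \ \tfrac{1}{2}\,\|\theta\|^2
\quad \text{s.t.} \quad y_i\, N_\theta(x_i) \ge 1 \quad \forall i \in \{1, \dots, n\}

% KKT conditions (gradients read as Clarke subgradients for ReLU networks)
\theta = \sum_{i=1}^{n} \lambda_i\, y_i\, \nabla_\theta N_\theta(x_i), \qquad
\lambda_i \ge 0, \qquad
\lambda_i \bigl( y_i\, N_\theta(x_i) - 1 \bigr) = 0
```

Under this characterization the trained weights are a combination of gradients taken at training points, which gives some intuition for why those points can leak from the parameters.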

Findings

01

In the univariate setting, provably reconstructs a set of points of which at least a constant fraction are training points.

02

In the high-dimensional setting, identifies with high probability whether a given point was in the training set.

03

To the authors' knowledge, the first provable vulnerabilities shown in the implicit-bias-driven setting.

Abstract

We study what provable privacy attacks can be shown on trained 2-layer ReLU neural networks. We explore two types of attacks: data reconstruction attacks and membership inference attacks. We prove that theoretical results on the implicit bias of 2-layer neural networks can be used to provably reconstruct a set of which at least a constant fraction are training points in a univariate setting, and can also be used to identify with high probability whether a given point was used in the training set in a high-dimensional setting. To the best of our knowledge, our work is the first to show provable vulnerabilities in this implicit-bias-driven setting.
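
To make the membership inference idea concrete, here is a minimal sketch in plain Python. It assumes (this is not stated on this page) that the test thresholds the magnitude of the network's output, on the intuition that a max-margin network gives training points an output of magnitude at least 1 while unrelated points may score near zero. The function names, the threshold of 0.5, and the toy weights are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def relu(z: float) -> float:
    """Rectified linear unit."""
    return z if z > 0.0 else 0.0


def two_layer_relu(
    x: Sequence[float],
    W: Sequence[Sequence[float]],
    b: Sequence[float],
    v: Sequence[float],
) -> float:
    """Evaluate N(x) = sum_j v[j] * relu(<W[j], x> + b[j])."""
    total = 0.0
    for w_j, b_j, v_j in zip(W, b, v):
        pre_activation = sum(w * xi for w, xi in zip(w_j, x)) + b_j
        total += v_j * relu(pre_activation)
    return total


def looks_like_member(
    x: Sequence[float],
    W: Sequence[Sequence[float]],
    b: Sequence[float],
    v: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, chosen only for illustration
) -> bool:
    """Flag x as a likely training point when |N(x)| reaches the threshold."""
    return abs(two_layer_relu(x, W, b, v)) >= threshold


# Hand-picked toy network with two hidden neurons (illustrative weights only).
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
v = [1.0, -1.0]

on_margin = [1.0, 0.0]   # N(x) = 1.0, i.e. exactly on the margin
near_zero = [0.1, 0.05]  # N(x) is about 0.05, far below the margin

print(looks_like_member(on_margin, W, b, v))
print(looks_like_member(near_zero, W, b, v))
```

The sketch only illustrates the shape of an output-threshold test; the paper's actual attack, its assumptions on dimension, and its probability guarantees are in the full text.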

Peer Reviews

Decision · Submitted to ICLR 2025

Reviewer 01 · Rating 6 · Confidence 3

Strengths

The overall question is interesting and would certainly be of interest to the community at large. The main contribution of this paper is to show there is _some_ theoretical basis for the literature on data reconstruction and membership inference attacks. However, the results in this paper are somewhat lacklustre for reasons outlined in the next two sections.

Weaknesses

The paper is lacking in the areas of exposition and technical details. Concepts which are central to the paper, such as gradient flow, data reconstruction, and membership inference, are never formally defined, which makes the paper inaccessible to all but a very niche audience. I have listed a set of points in the next section for the authors to clarify.

In the interest of transparency, I am clearly stating my position here:
- The paper needs more work especially in exposition (see above). At t

Reviewer 02 · Rating 8 · Confidence 3

Strengths

1) As far as I am aware, the first general results of effective privacy attacks on neural networks, as opposed to empirical results known for specific datasets and models.
2) Surfacing assumptions required, which future work can weaken; the paper already discusses how the dimensionality assumptions may be weakened (given empirical evidence).
3) I believe the techniques in the proofs for the attacks to be novel to the private ML community

Weaknesses

1) I think the presentation could make it more explicit what the contributions of the theory are to the broader privacy community. I attribute this mainly to the use of vague language regarding the assumptions in the main text, such as in line 45 which states “certain settings with varying assumptions”. One could instead highlight specific assumptions (e.g., dimensionality for membership inference or the data-independence of the reconstruction attacks) and contrast them with what is already known

Reviewer 03 · Rating 5 · Confidence 4

Strengths

This paper studies an important problem of privacy in neural networks. Theoretical analyses are provided. Experiments are conducted to support the theoretical claims. Source code is provided.

Weaknesses

My biggest concern is that the novelty seems to be limited.
- It is well known [1] that overparameterized neural networks can memorize training input, and it is not surprising that the training input is encoded in the model weights in some way. Indeed, empirical studies have shown that it is possible to reconstruct training data from model weights [2] and even model updates [3]. From this perspective, the contribution of this work seems to be limited to providing a theoretical proof. However, th

Code & Models

Repositories

guy120494/Provable-Privacy-Attacks-on-Trained-Shallow-Neural-Networks
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Brain Tumor Detection and Classification · Privacy-Preserving Technologies in Data

Methods: Sparse Evolutionary Training