Provable Privacy Attacks on Trained Shallow Neural Networks
Guy Smorodinsky, Gal Vardi, Itay Safran

TL;DR
This paper demonstrates provable privacy vulnerabilities in trained 2-layer ReLU neural networks, including data reconstruction and membership inference, leveraging the implicit bias of these models.
Contribution
The paper introduces what the authors believe to be the first provable attacks on trained shallow neural networks based on their implicit bias, covering both data reconstruction and membership inference.
Findings
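In a univariate setting, provably reconstructs a set of points of which at least a constant fraction are training points.
In a high-dimensional setting, identifies with high probability whether a given point was used in the training set.
To the authors' knowledge, the first work to show provable vulnerabilities in this implicit-bias-driven setting.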
Abstract
We study what provable privacy attacks can be shown on trained 2-layer ReLU neural networks. We explore two types of attacks: data reconstruction attacks and membership inference attacks. We prove that theoretical results on the implicit bias of 2-layer neural networks can be used to provably reconstruct a set of which at least a constant fraction are training points in a univariate setting, and can also be used to identify with high probability whether a given point was used in the training set in a high-dimensional setting. To the best of our knowledge, our work is the first to show provable vulnerabilities in this implicit-bias-driven setting.
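The membership inference idea in the abstract can be illustrated with a toy margin test. This is a minimal sketch, not the authors' algorithm: it assumes that the implicit bias leaves training points on the margin, so that |N(x)| is close to 1 for members, while fresh high-dimensional points receive outputs near zero. The hand-built network, the threshold `tol`, and every function name below are hypothetical and chosen only for illustration; the network is constructed directly rather than trained.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def two_layer_relu(x, hidden_weights, output_weights):
    """N(x) = sum_j v_j * relu(<w_j, x>) for a bias-free 2-layer ReLU net."""
    total = 0.0
    for w, v in zip(hidden_weights, output_weights):
        pre_activation = sum(wi * xi for wi, xi in zip(w, x))
        total += v * relu(pre_activation)
    return total


def looks_like_training_point(x, net, tol=0.1):
    """Flag x as a likely member if |N(x)| is within tol of the margin value 1.

    Assumption (not from the source): members sit on the margin,
    non-members in high dimension do not.
    """
    return abs(abs(net(x)) - 1.0) <= tol


# Toy setup (assumed): n orthonormal training points in dimension d >> n,
# with alternating labels, and one hidden neuron aligned with each point.
d, n = 1000, 10
train_points = [[1.0 if k == i else 0.0 for k in range(d)] for i in range(n)]
labels = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
hidden_weights = train_points   # w_i = x_i
output_weights = labels         # v_i = y_i, so N(x_i) = y_i


def net(x):
    return two_layer_relu(x, hidden_weights, output_weights)


# Fresh points: Gaussian vectors with norm roughly 1, drawn with a fixed seed.
rng = random.Random(0)
fresh_points = [[rng.gauss(0.0, d ** -0.5) for _ in range(d)] for _ in range(50)]

train_flags = [looks_like_training_point(x, net) for x in train_points]
fresh_flags = [looks_like_training_point(x, net) for x in fresh_points]

print("training points flagged:", sum(train_flags), "of", len(train_flags))
print("fresh points flagged:", sum(fresh_flags), "of", len(fresh_flags))
```

In this toy setup the separation comes entirely from the dimension being much larger than the number of training points; with comparable `d` and `n` the fresh outputs would no longer concentrate near zero and the test would become unreliable.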
Peer Reviews
Decision · Submitted to ICLR 2025
The overall question is interesting and would certainly be of interest to the community at large. The main contribution of this paper is to show there is _some_ theoretical basis for the literature on data reconstruction and membership inference attacks. However, the results in this paper are somewhat lacklustre for reasons outlined in the next two sections.
The paper is lacking in the areas of exposition and technical details. Concepts which are central to the paper such as Gradient flow, Data Reconstruction, and Membership Inference are never formally defined, which makes the paper inaccessible to all but a very niche audience. I have listed a set of points in the next section for the authors to clarify. In the interest of transparency, I am clearly stating my position here: - The paper needs more work especially in exposition (see above). At t
1) As far as I am aware, the first general results of effective privacy attacks on neural networks, as opposed to empirical results known for specific datasets and models. 2) Surfacing assumptions required, which future work can weaken; the paper already discusses how the dimensionality assumptions may be weakened (given empirical evidence). 3) I believe the techniques in the proofs for the attacks to be novel to the private ML community
1) I think the presentation could make more explicit what the theory contributes to the broader privacy community. I attribute this mainly to the use of vague language regarding the assumptions in the main text, such as in line 45 which states “certain settings with varying assumptions”. One could instead highlight specific assumptions (e.g., dimensionality for membership inference or the data-independence of the reconstruction attacks) and contrast them with what is already known
This paper studies the important problem of privacy in neural networks. Theoretical analyses are provided. Experiments are conducted to support the theoretical claims. Source code is provided.
My biggest concern is that the novelty seems to be limited. - It is well known [1] that overparameterized neural networks can memorize training inputs, and it is not surprising that the training input is encoded in the model weights in some way. Indeed, empirical studies have shown that it is possible to reconstruct training data from model weights [2] and even model updates [3]. From this perspective, the contribution of this work seems to be limited to providing a theoretical proof. However, th
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Brain Tumor Detection and Classification · Privacy-Preserving Technologies in Data
Methods: Sparse Evolutionary Training
