Reconstructing Training Data from Model Gradient, Provably
Zihan Wang, Jason D. Lee, Qi Lei

TL;DR
This paper shows that training data can be fully reconstructed from a single gradient query of a neural network at a randomly chosen parameter value. The attack is provable and exposes a significant privacy risk.
Contribution
The paper proves that training data is identifiable from gradients under mild conditions, for shallow or deep networks and a wide range of activation functions, and introduces an efficient reconstruction algorithm based on tensor decomposition.
Findings
Training data can be fully reconstructed from a single gradient query, without training on or memorizing the data.
The reconstruction algorithm is both statistically and computationally efficient.
The results suggest potentially severe privacy threats, especially in federated learning.
Abstract
Understanding when and how much a model gradient leaks information about the training sample is an important question in privacy. In this paper, we present a surprising result: even without training or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild conditions: with shallow or deep neural networks and a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition to reconstruct the training data. As a provable attack that reveals sensitive training data, our findings suggest potential severe threats to privacy, especially in federated learning.
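As a rough illustration of why a gradient can leak a training sample, the sketch below covers only the simplest case: one sample passing through one linear layer with a bias and a squared loss. This is not the paper's tensor decomposition algorithm, which targets the harder setting described in the abstract. All names here (`single_sample_gradients`, `reconstruct_input`, `x_private`) are hypothetical and chosen for illustration. The gradient with respect to the weights is the outer product of the residual and the input, and the gradient with respect to the bias is the residual itself, so dividing one row of the weight gradient by the matching bias gradient entry returns the input.

```python
import random


def single_sample_gradients(W, b, x, y):
    """Gradients of 0.5 * ||W x + b - y||^2 w.r.t. W and b for one sample."""
    residual = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i - y_i
        for row, b_i, y_i in zip(W, b, y)
    ]
    # dL/dW is the outer product residual * x^T; dL/db is the residual.
    grad_W = [[r * x_j for x_j in x] for r in residual]
    grad_b = list(residual)
    return grad_W, grad_b


def reconstruct_input(grad_W, grad_b):
    """Recover x from (dL/dW, dL/db), assuming a single training sample."""
    # Use the row with the largest bias gradient for numerical stability.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if grad_b[i] == 0.0:
        raise ValueError("zero residual: the gradient carries no information")
    return [g / grad_b[i] for g in grad_W[i]]


# Toy setup (assumed sizes): random parameters and one private sample.
rng = random.Random(0)
d_in, d_out = 5, 3
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
b = [rng.gauss(0, 1) for _ in range(d_out)]
x_private = [rng.gauss(0, 1) for _ in range(d_in)]
y_private = [rng.gauss(0, 1) for _ in range(d_out)]

# The attacker sees only the gradients, not x_private or y_private.
grad_W, grad_b = single_sample_gradients(W, b, x_private, y_private)
x_recovered = reconstruct_input(grad_W, grad_b)
max_error = max(abs(a - c) for a, c in zip(x_private, x_recovered))
print("max reconstruction error:", max_error)
```

With a batch of several samples the gradient becomes a sum of such outer products and this division no longer separates them, which is the regime where a decomposition method like the one the abstract describes is needed.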
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques
