Reconstructing Training Data from Model Gradient, Provably

Zihan Wang; Jason D. Lee; Qi Lei

arXiv:2212.03714·cs.LG·June 13, 2023

TL;DR

The paper shows that training samples can be fully reconstructed from a single gradient query of a neural network at a randomly chosen parameter value, without any training or memorization. It gives a provable attack and points to significant privacy risks.

Contribution

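The paper proves that training data is identifiable from the model gradient under mild conditions, covering shallow and deep networks and a wide range of activation functions, and introduces an efficient tensor-decomposition algorithm that carries out the reconstruction.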
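To make "tensor decomposition" concrete, here is a minimal sketch of one standard routine, the tensor power method with deflation, applied to a symmetric third-order tensor with orthonormal components. It is a generic illustration only: the paper's actual algorithm and the way it builds a tensor from the gradient are not reproduced here, and the function names `power_iteration` and `decompose`, the dimensions, and the weights are all assumptions made for the example.

```python
import numpy as np

def power_iteration(T, n_iter=100, seed=0):
    """Find one (weight, component) pair of a symmetric 3rd-order tensor."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        # Contract the tensor with u along two modes: u <- T(I, u, u).
        u = np.einsum("ijk,j,k->i", T, u, u)
        u /= np.linalg.norm(u)
    weight = float(np.einsum("ijk,i,j,k->", T, u, u, u))
    return weight, u

def decompose(T, rank, n_iter=100):
    """Recover `rank` components one at a time, deflating after each."""
    T = T.copy()
    components = []
    for r in range(rank):
        weight, u = power_iteration(T, n_iter=n_iter, seed=r)
        components.append((weight, u))
        # Subtract the recovered rank-one term before the next round.
        T -= weight * np.einsum("i,j,k->ijk", u, u, u)
    return components

# Hypothetical test tensor: 3 orthonormal components in dimension 4.
rng = np.random.default_rng(42)
V, _ = np.linalg.qr(rng.normal(size=(4, 4)))
V = V[:, :3]
weights = np.array([3.0, 2.0, 1.0])
T = np.einsum("r,ir,jr,kr->ijk", weights, V, V, V)

components = decompose(T, rank=3)
```

Each recovered vector should match one column of `V`, and the recovered weights should match `weights` up to ordering. The sketch assumes exactly orthonormal components and noise-free data; real inputs generally need a whitening step and some tolerance to estimation error.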

Findings

01. Training data can be fully reconstructed from a single gradient query at a randomly chosen parameter value.

02. The reconstruction algorithm, based on tensor decomposition, is both statistically and computationally efficient.

03. The results suggest potentially severe privacy threats, especially in federated learning.

Abstract

Understanding when and how much a model gradient leaks information about the training sample is an important question in privacy. In this paper, we present a surprising result: even without training or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild conditions: with shallow or deep neural networks and a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition to reconstruct the training data. As a provable attack that reveals sensitive training data, our findings suggest potential severe threats to privacy, especially in federated learning.
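For intuition about why a gradient can reveal a training sample at all, the sketch below shows a much simpler and well-known special case, not the paper's method: one sample passed through a single linear layer with a bias. There the weight gradient is the outer product of the output error and the input, and the bias gradient is the output error itself, so dividing one by the other returns the input exactly. The function name `reconstruct_input`, the squared loss, and the dimensions are assumptions made for the example.

```python
import numpy as np

def reconstruct_input(grad_W, grad_b, eps=1e-12):
    """Recover x from single-sample gradients of a layer z = W x + b."""
    # dL/dW = delta x^T and dL/db = delta, so any row k with
    # delta_k != 0 satisfies grad_W[k] / grad_b[k] = x.
    k = int(np.argmax(np.abs(grad_b)))
    if abs(grad_b[k]) < eps:
        raise ValueError("bias gradient is ~0; x is not recoverable this way")
    return grad_W[k] / grad_b[k]

# Hypothetical setup: random parameters, one private sample (x, y).
rng = np.random.default_rng(0)
d, m = 5, 4
x = rng.normal(size=d)
y = rng.normal(size=m)
W = rng.normal(size=(m, d))
b = rng.normal(size=m)

# Squared loss L = 0.5 * ||W x + b - y||^2 and its gradients.
delta = W @ x + b - y
grad_W = np.outer(delta, x)
grad_b = delta

x_hat = reconstruct_input(grad_W, grad_b)
```

This shortcut breaks down once the gradient is averaged over several samples, because the per-sample outer products are then mixed together. That harder setting is the one the abstract addresses with identifiability results and a tensor-decomposition algorithm.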

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques