TL;DR
This paper presents a method for selectively removing the influence of specific training data from a neural network's weights without retraining from scratch, so that probing the weights, not just the outputs, cannot distinguish the network from one trained without that data.
Contribution
It introduces a weight-scrubbing technique that erases information about particular training data while preserving model performance, and that requires neither retraining nor access to the original training data.
Findings
The method effectively erases data-specific information from weights.
It does not degrade overall network performance.
The approach is efficient and applicable to deep neural networks.
Abstract
We explore the problem of selectively forgetting a particular subset of the data used for training a deep neural network. While the effects of the data to be forgotten can be hidden from the output of the network, insights may still be gleaned by probing deep into its weights. We propose a method for "scrubbing" the weights clean of information about a particular set of training data. The method does not require retraining from scratch, nor access to the data originally used for training. Instead, the weights are modified so that any probing function of the weights is indistinguishable from the same function applied to the weights of a network trained without the data to be forgotten. This condition is a generalized and weaker form of Differential Privacy. Exploiting ideas related to the stability of stochastic gradient descent, we introduce an upper bound on the amount of information…
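The indistinguishability condition in the abstract can be made concrete with a toy calculation. The sketch below is not the paper's scrubbing procedure. It assumes, purely for illustration, that the weights of a network trained on all data and of one trained without the forgotten data are each Gaussian-distributed around hypothetical means `w_full` and `w_retain`, and that scrubbing amounts to adding isotropic Gaussian noise. Under those assumptions, the KL divergence between the two weight distributions measures how well an observer could tell them apart, and it shrinks as more noise is added.

```python
import math

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL divergence KL(P || Q) between two diagonal Gaussians."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )

# Hypothetical mean weights (illustrative values, not from the paper).
w_full = [1.0, -0.5, 2.0]    # trained on all data
w_retain = [0.9, -0.4, 1.5]  # trained without the data to be forgotten
train_var = 0.01             # assumed variance due to training randomness

def kl_after_noise(noise_var):
    """KL between the two weight distributions after adding isotropic noise."""
    var = [train_var + noise_var] * len(w_full)
    return gaussian_kl(w_full, var, w_retain, var)

# More noise makes the two weight distributions harder to tell apart.
kl_values = [kl_after_noise(s) for s in (0.0, 0.1, 1.0)]
print(kl_values)
```

The toy ignores the other half of the trade-off: noise large enough to hide the forgotten data also perturbs the weights that the remaining data depends on, which is why a practical method has to choose where and how much to perturb.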
Videos
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks · YouTube
