Random Relabeling for Efficient Machine Unlearning

Junde Li; Swaroop Ghosh

arXiv:2305.12320·cs.LG·May 23, 2023·1 cites

Open Access

TL;DR

This paper introduces random relabeling, an efficient method for machine unlearning that supports data removal requests in supervised learning, reducing computational costs compared to retraining from scratch.

Contribution

The paper proposes a novel unlearning scheme, random relabeling, that applies to generic supervised learning algorithms and enables efficient data removal.

Findings

01 Supports sequential data removal requests in online settings.

02 Provides a removal certification method based on probability distribution similarity.

03 Applicable to logit-based classifiers.
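
The second finding mentions certification based on probability distribution similarity, but this listing does not say which similarity measure the paper uses. The following is only a minimal sketch under stated assumptions: it compares the softmax outputs of an unlearned model against those of a reference model (for example, one retrained without the removed data) using the mean Jensen-Shannon divergence. The function names, the choice of divergence, and the threshold value are all hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_js_divergence(logits_a, logits_b, eps=1e-12):
    """Mean Jensen-Shannon divergence (in nats) between two models' predictive
    distributions on the same inputs. Assumed measure, not the paper's."""
    p = softmax(np.asarray(logits_a, dtype=float))
    q = softmax(np.asarray(logits_b, dtype=float))
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * (np.log(p + eps) - np.log(m + eps)), axis=1)
    kl_qm = np.sum(q * (np.log(q + eps) - np.log(m + eps)), axis=1)
    return float(np.mean(0.5 * (kl_pm + kl_qm)))

def passes_similarity_check(logits_unlearned, logits_reference, threshold=0.05):
    """Hypothetical check: accept the removal if the two models' output
    distributions are close. The threshold is an arbitrary illustrative value."""
    return mean_js_divergence(logits_unlearned, logits_reference) <= threshold
```

Because the check only needs logits, it fits any logit-based classifier, which is consistent with the third finding.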

Abstract

Learning algorithms and data are the driving forces behind machine learning's transformation of industrial intelligence. However, individuals' right to retract their personal data, together with the relevant data privacy regulations, poses a major challenge for machine learning: how to design an efficient mechanism that supports certified data removal. Removing previously seen data, known as machine unlearning, is challenging because these data points were implicitly memorized during the training process of the learning algorithm. Retraining on the remaining data from scratch straightforwardly serves such deletion requests; however, this naive method is often computationally infeasible. We propose an unlearning scheme, random relabeling, which is applicable to generic supervised learning algorithms, to efficiently handle sequential data removal requests in the online setting. A less constraining removal…
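
The abstract excerpt names the scheme but does not spell out its steps. As a rough illustration only, one plausible reading is: replace the labels of the samples to be removed with random labels, then fine-tune the already trained model on those relabeled samples instead of retraining from scratch. The sketch below applies that reading to a small softmax-regression classifier in NumPy. The helper names, hyperparameters, and synthetic data are assumptions made for illustration and should not be taken as the authors' implementation.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gradient_steps(W, X, y, num_classes, lr=0.1, epochs=200):
    """Full-batch gradient descent on softmax cross-entropy; returns new weights."""
    Y = np.eye(num_classes)[y]  # one-hot targets
    for _ in range(epochs):
        P = softmax(X @ W)
        W = W - lr * X.T @ (P - Y) / len(X)
    return W

def random_relabel_unlearn(W, X_forget, num_classes, rng, lr=0.1, epochs=50):
    """Hypothetical helper: draw uniform random labels for the forget set
    and fine-tune the trained weights on the relabeled samples."""
    y_rand = rng.integers(0, num_classes, size=len(X_forget))
    return gradient_steps(W, X_forget, y_rand, num_classes, lr=lr, epochs=epochs)

def accuracy(W, X, y):
    return float((softmax(X @ W).argmax(axis=1) == y).mean())

# Synthetic data (assumed for illustration): three Gaussian clusters.
rng = np.random.default_rng(0)
num_classes, dim = 3, 5
centers = rng.normal(scale=3.0, size=(num_classes, dim))
y = rng.integers(0, num_classes, size=300)
X = centers[y] + rng.normal(size=(300, dim))

# Train once, then serve a removal request for the first 20 samples.
W_trained = gradient_steps(np.zeros((dim, num_classes)), X, y, num_classes)
forget_idx = np.arange(20)
W_unlearned = random_relabel_unlearn(W_trained, X[forget_idx], num_classes, rng)

acc_before = accuracy(W_trained, X[forget_idx], y[forget_idx])
acc_after = accuracy(W_unlearned, X[forget_idx], y[forget_idx])
print(f"forget-set accuracy before: {acc_before:.2f}, after: {acc_after:.2f}")
```

The fine-tuning pass touches only the forget set, so its cost scales with the size of the removal request rather than with the full training set. That is the efficiency argument the abstract makes against retraining from scratch. In practice one would also monitor accuracy on the retained data, which this sketch does not do.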

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Imbalanced Data Classification Techniques · Data Quality and Management