Forget Unlearning: Towards True Data-Deletion in Machine Learning

Rishav Chourasia; Neil Shah

arXiv:2210.08911 · stat.ML · February 15, 2023 · 1 citation

TL;DR

This paper critically examines the limitations of existing unlearning algorithms in machine learning, highlighting privacy vulnerabilities, and proposes a new, secure, and efficient data deletion method based on noisy gradient descent.

Contribution

It introduces a sound deletion guarantee, reveals privacy interdependencies among data records, and presents a novel unlearning algorithm that ensures privacy and efficiency.

Findings

01 Existing unlearning methods that cache partial computations can leak deleted data over a series of model releases.

02 Protecting deleted records requires protecting the privacy of the records that remain.

03 The proposed algorithm is accurate, computationally efficient, and secure.

Abstract

Unlearning algorithms aim to remove deleted data's influence from trained models at a cost lower than full retraining. However, prior guarantees of unlearning in literature are flawed and don't protect the privacy of deleted records. We show that when users delete their data as a function of published models, records in a database become interdependent. So, even retraining a fresh model after deletion of a record doesn't ensure its privacy. Secondly, unlearning algorithms that cache partial computations to speed up the processing can leak deleted information over a series of releases, violating the privacy of deleted records in the long run. To address these, we propose a sound deletion guarantee and show that the privacy of existing records is necessary for the privacy of deleted records. Under this notion, we propose an accurate, computationally efficient, and secure machine…
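The summary above describes a deletion method based on noisy gradient descent. As a rough illustration of that idea only, the sketch below trains a regularized least-squares model with noisy gradient steps and, on a deletion request, drops the record and continues for a small number of noisy steps from the current model instead of retraining from scratch. It is not the authors' algorithm: the loss, the noise scale, the step counts, and the names `noisy_gd` and `delete_record` are all assumptions made for the example, and the code carries no privacy or deletion guarantee.

```python
import numpy as np


def noisy_gd(X, y, theta, steps, rng, lr=0.1, sigma=0.01, l2=0.01):
    """Noisy gradient descent on L2-regularized least squares (illustrative).

    Each step takes a full-batch gradient step and adds Gaussian noise.
    The noise scaling sqrt(2 * lr) * sigma is an assumption of this sketch.
    """
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / len(y) + l2 * theta
        noise = rng.normal(0.0, 1.0, size=theta.shape)
        theta = theta - lr * grad + np.sqrt(2.0 * lr) * sigma * noise
    return theta


def delete_record(X, y, theta, index, steps, rng):
    """Hypothetical deletion step: remove one record, then fine-tune.

    Starts from the current model and runs a few noisy steps on the
    remaining data, which is cheaper than retraining from scratch.
    """
    X_new = np.delete(X, index, axis=0)
    y_new = np.delete(y, index, axis=0)
    return X_new, y_new, noisy_gd(X_new, y_new, theta, steps, rng)


# Synthetic data, used only to exercise the two functions.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(200, 3))
y = X @ true_theta + 0.1 * rng.normal(size=200)

theta = noisy_gd(X, y, np.zeros(3), steps=500, rng=rng)  # initial training
X2, y2, theta2 = delete_record(X, y, theta, index=7, steps=50, rng=rng)
```

In this sketch the deletion update runs 50 steps against 500 for initial training. How many steps and how much noise are needed for a meaningful guarantee is exactly what the paper analyzes, and this example does not attempt to reproduce that analysis.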

Videos

Forget Unlearning: Towards True Data-Deletion in Machine Learning (SlidesLive)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
