Approximate Data Deletion in Generative Models

Zhifeng Kong; Scott Alfeld

arXiv:2206.14439·cs.LG·June 30, 2022

Open Access

TL;DR

This paper introduces a density-ratio-based framework for efficient approximate data deletion in generative models, addressing privacy concerns while reducing computational costs compared to full retraining.

Contribution

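The paper proposes a density-ratio-based framework for generative models, a fast approximate data deletion method built on it, and a statistical test for estimating whether training points have been deleted, with theoretical guarantees and empirical validation.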

Findings

01. The proposed method achieves fast approximate data deletion.

02. A statistical test accurately estimates deletion success.

03. The approach works across various generative models.

Abstract

Users have the right to have their data deleted by third-party learned systems, as codified by recent legislation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Such data deletion can be accomplished by full re-training, but this incurs a high computational cost for modern machine learning models. To avoid this cost, many approximate data deletion methods have been developed for supervised learning. Unsupervised learning, in contrast, remains largely an open problem when it comes to (approximate or exact) efficient data deletion. In this paper, we propose a density-ratio-based framework for generative models. Using this framework, we introduce a fast method for approximate data deletion and a statistical test for estimating whether or not training points have been deleted. We provide theoretical guarantees under various learner…
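The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general density-ratio idea, not the authors' method. It assumes that samples from a pre-trained generator can be reweighted by an estimate of rho(x) = p_remaining(x) / p_full(x), so that the generator does not have to be retrained. The histogram ratio estimator, the rejection sampler, the toy 1-D data, and every function name here are hypothetical stand-ins chosen for illustration.

```python
import numpy as np

def histogram_density_ratio(remaining, full, bins):
    """Estimate rho(x) = p_remaining(x) / p_full(x) on fixed bins (illustrative estimator)."""
    num, _ = np.histogram(remaining, bins=bins, density=True)
    den, _ = np.histogram(full, bins=bins, density=True)
    # Bins where the full data has no mass get ratio 0.
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    def rho(x):
        idx = np.clip(np.digitize(x, bins) - 1, 0, len(ratio) - 1)
        return ratio[idx]

    return rho, ratio.max()

def sample_after_deletion(generator, rho, rho_max, n, rng):
    """Rejection-sample from the generator, accepting x with probability rho(x) / rho_max."""
    kept, total = [], 0
    while total < n:
        x = generator(n)
        accept = rng.uniform(size=n) * rho_max < rho(x)
        kept.append(x[accept])
        total += int(accept.sum())
    return np.concatenate(kept)[:n]

rng = np.random.default_rng(0)
remaining = rng.normal(0.0, 1.0, size=5000)   # data that stays
deleted = rng.normal(6.0, 0.3, size=500)      # data whose removal is requested
full = np.concatenate([remaining, deleted])

# Stand-in for a pre-trained generative model: resample the full data with small noise.
def generator(n):
    return rng.choice(full, size=n) + rng.normal(0.0, 0.05, size=n)

bins = np.linspace(-5.0, 8.0, 66)
rho, rho_max = histogram_density_ratio(remaining, full, bins)

before = generator(2000)
samples = sample_after_deletion(generator, rho, rho_max, 2000, rng)
print("fraction near deleted cluster before:", np.mean(before > 4.5))
print("fraction near deleted cluster after: ", np.mean(samples > 4.5))
```

In this toy setting the reweighted sampler should place far less mass near the deleted cluster than the original generator does. A histogram ratio would not scale to high-dimensional data, where a learned ratio estimator would be needed.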

Taxonomy

Topics: Algorithms and Data Compression · Machine Learning and Data Classification · Advanced Data Storage Technologies

Methods: Test