Approximate Data Deletion in Generative Models
Zhifeng Kong, Scott Alfeld

TL;DR
This paper introduces a density-ratio-based framework for efficient approximate data deletion in generative models, addressing privacy concerns while reducing computational costs compared to full retraining.
Contribution
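The paper proposes a density-ratio-based framework for generative models, a fast approximate data deletion method built on it, and a statistical test for whether training points have been deleted, with theoretical guarantees and empirical validation.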
Findings
The proposed method achieves fast approximate data deletion.
A statistical test accurately estimates whether training points have been deleted.
The approach works across various generative models.
Abstract
Users have the right to have their data deleted by third-party learned systems, as codified by recent legislation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Such data deletion can be accomplished by full re-training, but this incurs a high computational cost for modern machine learning models. To avoid this cost, many approximate data deletion methods have been developed for supervised learning. Unsupervised learning, in contrast, remains largely an open problem when it comes to (approximate or exact) efficient data deletion. In this paper, we propose a density-ratio-based framework for generative models. Using this framework, we introduce a fast method for approximate data deletion and a statistical test for estimating whether or not training points have been deleted. We provide theoretical guarantees under various learner…
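The abstract as shown here does not spell out the mechanism, so the following is only a toy illustration of what "density-ratio-based" deletion could look like, not the authors' method. It assumes one-dimensional data, a histogram estimate of the ratio between the retained-data density and the full-data density, and rejection sampling to reweight an already-trained sampler. All names (`histogram_density_ratio`, `sample_after_deletion`, the jittered-resampling "model") are hypothetical.

```python
import numpy as np

def histogram_density_ratio(retained, full, bins):
    """Estimate rho(x) = p_retained(x) / p_full(x) on fixed histogram bins."""
    p_ret, _ = np.histogram(retained, bins=bins, density=True)
    p_full, _ = np.histogram(full, bins=bins, density=True)
    # Bins the full data never visits get ratio 0 (nothing to reweight there).
    return np.divide(p_ret, p_full, out=np.zeros_like(p_ret), where=p_full > 0)

def sample_after_deletion(sampler, ratio, bins, n, rng):
    """Rejection-sample from (model density) * rho, without retraining the model."""
    max_ratio = ratio.max()
    kept = []
    while len(kept) < n:
        x = sampler(n)
        idx = np.clip(np.digitize(x, bins) - 1, 0, len(ratio) - 1)
        in_range = (x >= bins[0]) & (x <= bins[-1])
        accept_prob = np.where(in_range, ratio[idx] / max_ratio, 0.0)
        kept.extend(x[rng.uniform(size=n) < accept_prob])
    return np.array(kept[:n])

rng = np.random.default_rng(0)
retained = rng.normal(0.0, 1.0, size=900)   # data that stays
deleted = rng.normal(4.0, 0.5, size=100)    # data whose removal was requested
full = np.concatenate([retained, deleted])

# Stand-in "generative model" trained on the full data (assumption):
# resample training points and add a small Gaussian jitter.
def model_sampler(n):
    return rng.choice(full, size=n) + rng.normal(scale=0.1, size=n)

bins = np.linspace(-5.0, 7.0, 25)
ratio = histogram_density_ratio(retained, full, bins)

before = model_sampler(5000)
after = sample_after_deletion(model_sampler, ratio, bins, 5000, rng)

print("share of samples above 3 before deletion:", np.mean(before > 3))
print("share of samples above 3 after deletion: ", np.mean(after > 3))
```

In this sketch the original model is never refit; only the ratio estimate changes what gets sampled, which is the intuition behind avoiding full retraining. A histogram ratio is a deliberately crude choice that would not scale to high-dimensional data.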
Taxonomy
Topics: Algorithms and Data Compression · Machine Learning and Data Classification · Advanced Data Storage Technologies
Methods: Test
