TL;DR
This paper introduces Reload, a partially-blind unlearning framework that efficiently removes sensitive data influence from models without direct access to the forget set, enhancing privacy and security.
Contribution
It proposes a novel partially-blind unlearning method and a practical framework that outperform existing forget-set-dependent approaches in both efficiency and effectiveness.
Findings
Reload unlearns entities using less than 0.025% of retain set data.
It unlearns in under 8 minutes on Llama2-7B.
Reload remains effective even when only 10% of corrupted data is identified.
Abstract
Training machine learning models requires the storage of large datasets, which often contain sensitive or private data. Storing data is associated with a number of potential risks which increase over time, such as database breaches and malicious adversaries. Machine unlearning is the study of methods to efficiently remove the influence of training data subsets from previously-trained models. Existing unlearning methods typically require direct access to the "forget set" -- the data to be forgotten -- and organisations must retain this data for unlearning rather than deleting it immediately upon request, increasing risks associated with the forget set. We introduce partially-blind unlearning -- utilizing auxiliary information to unlearn without explicit access to the forget set. We also propose a practical framework, Reload, a partially-blind method based on gradient optimization and…