PRUNE: A Patching Based Repair Framework for Certifiable Unlearning of Neural Networks
Xuran Li, Jingyi Wang, Xiaohan Yuan, Peixin Zhang

TL;DR
This paper introduces PRUNE, a patching-based framework for certifiable unlearning in neural networks. It removes targeted training data from a trained model efficiently and with verifiable guarantees, supporting privacy and regulatory requirements such as the right to be forgotten.
Contribution
It proposes a novel neural network repair approach using minimal patches for certifiable unlearning, including strategies for unlearning large data subsets efficiently.
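To make the idea of a "minimal patch" concrete, here is a toy sketch, not the paper's algorithm. It assumes a plain linear classifier with logits `W @ x + b` and treats "forgetting" a point `x` as no longer predicting its current label there. The function name `minimal_flip_patch` and the `margin` parameter are hypothetical. Under those assumptions, the smallest Frobenius-norm weight change that lifts the runner-up class above the current label by `margin` has a closed form, and the resulting logit gap can be checked directly, which is the sense in which the change is verifiable in this toy setting.

```python
import numpy as np

def minimal_flip_patch(W, b, x, margin=1e-3):
    """Toy illustration (assumed setup, not PRUNE itself).

    Returns (dW, new_label), where dW is the smallest Frobenius-norm
    change to W that makes the runner-up class beat the currently
    predicted class at x by exactly `margin`. Requires x != 0.
    """
    logits = W @ x + b
    y = int(np.argmax(logits))          # currently predicted label
    masked = logits.copy()
    masked[y] = -np.inf
    j = int(np.argmax(masked))          # runner-up class
    gap = logits[y] - logits[j]         # how far y leads j (>= 0)

    # Lower row y and raise row j along x by the same amount t; this
    # symmetric split minimizes ||dW||_F for the required logit shift.
    t = (gap + margin) / (2.0 * float(x @ x))
    dW = np.zeros_like(W)
    dW[y] = -t * x
    dW[j] = t * x
    return dW, j

# Usage: patch a 3-class model so it no longer predicts class 0 at x.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([2.0, 1.0])
dW, new_label = minimal_flip_patch(W, b, x)
patched_logits = (W + dW) @ x + b
```

The patch touches only two rows of `W`, so predictions on inputs nearly orthogonal to `x` change little. A real deep network would need a more careful formulation than this linear case.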
Findings
Effective unlearning with measurable guarantees.
Preserves model performance after unlearning.
Competitive efficiency and memory usage compared to baselines.
Abstract
It is often desirable to remove (a.k.a. unlearn) a specific part of the training data from a trained neural network model. A typical application scenario is to protect the data holder's right to be forgotten, which has been promoted by many recent regulations. Existing unlearning methods involve training alternative models on the remaining data, which may be costly and challenging to verify from the perspective of the data holder or a third-party auditor. In this work, we provide a new angle and propose a novel unlearning approach that imposes a carefully crafted "patch" on the original neural network to achieve targeted "forgetting" of the data requested for deletion. Specifically, inspired by the research line of neural network repair, we propose to strategically seek a lightweight minimum "patch" for unlearning a given data point with a certifiable guarantee. Furthermore, to unlearn a…
Taxonomy
Topics: Adversarial Robustness in Machine Learning
