TL;DR
This paper shows that current verification methods for machine unlearning are fragile: a model provider can pass verification while still retaining information about the data it was supposed to unlearn.
Contribution
The paper introduces two novel adversarial unlearning processes that can bypass existing verification strategies, exposing safety vulnerabilities in machine unlearning verification methods.
Findings
Verification strategies are susceptible to circumvention by adversarial unlearning.
The authors propose two new methods to deceive verification strategies.
Empirical and theoretical validation confirms the effectiveness of these methods.
Abstract
As privacy concerns escalate in the realm of machine learning, data owners now have the option to utilize machine unlearning to remove their data from machine learning models, following recent legislation. To enhance transparency in machine unlearning and avoid potential dishonesty by model providers, various verification strategies have been proposed. These strategies enable data owners to ascertain whether their target data has been effectively unlearned from the model. However, our understanding of the safety issues of machine unlearning verification remains nascent. In this paper, we explore the novel research question of whether model providers can circumvent verification strategies while retaining the information of data supposedly unlearned. Our investigation leads to a pessimistic answer: the verification of machine unlearning is fragile. Specifically, we categorize the…