Label Leakage Attacks in Machine Unlearning: A Parameter and Inversion-Based Approach
Weidong Zheng, Kongyang Chen, Yao Huang, Yuanwei Guo, Yatie Xiao

TL;DR
This paper investigates privacy risks in machine unlearning, proposing four novel attack methods that identify the data classes a model has forgotten and reconstruct representative samples of them.
Contribution
It introduces four new attack techniques based on model parameters and inversion to reveal forgotten data classes in unlearning scenarios.
Findings
Attacks successfully identify forgotten classes across multiple datasets.
Model inversion attacks can reconstruct class-prototypical samples.
Parameter-based clustering effectively infers unlearned data categories.
Abstract
With the widespread application of artificial intelligence in face recognition and other fields, data privacy and security have received extensive attention, especially the "right to be forgotten" emphasized by numerous privacy protection laws. Existing work has proposed various unlearning methods, but these may inadvertently leak the categories of the unlearned data. This paper focuses on the category unlearning scenario, analyzes how the category of unlearned data can leak in multiple scenarios, and proposes four attack methods, from the perspectives of model parameters and model inversion, for attackers with different levels of background knowledge. At the level of model parameters, we construct discriminative features by computing either dot products or vector differences between the parameters of the target model and those of auxiliary models…
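The parameter-level idea in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so everything below is an assumption made for illustration and not the authors' method: the sketch assumes the compared parameters are per-class rows of a final classification layer, that one auxiliary model is available, and that the forgotten class is simply the row with the lowest dot product or the largest difference norm. The function names and the toy weights are hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two equal-length weight vectors."""
    return sum(a * b for a, b in zip(u, v))

def diff_norm(u, v):
    """Euclidean norm of the vector difference u - v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def class_features(target_rows, aux_rows):
    """Per-class (dot product, difference norm) between target and auxiliary rows.

    Assumption: row c of each matrix holds the final-layer weights for class c.
    """
    return [(dot(t, a), diff_norm(t, a)) for t, a in zip(target_rows, aux_rows)]

def guess_forgotten_class(target_rows, aux_rows):
    """Return the class with the lowest dot product and the class with the
    largest difference norm. This decision rule is a simplification chosen
    for the sketch; it is not taken from the paper."""
    feats = class_features(target_rows, aux_rows)
    by_dot = min(range(len(feats)), key=lambda c: feats[c][0])
    by_diff = max(range(len(feats)), key=lambda c: feats[c][1])
    return by_dot, by_diff

# Toy data (hypothetical): 5 classes, 8 weights per class.
aux_rows = [[math.sin(c + j) for j in range(8)] for c in range(5)]
# The target model stays close to the auxiliary model for retained classes...
target_rows = [[0.99 * w for w in row] for row in aux_rows]
# ...while the row for class 3 is flipped to mimic a strong unlearning update.
target_rows[3] = [-w for w in aux_rows[3]]

print(guess_forgotten_class(target_rows, aux_rows))
```

In this toy setup both features single out class 3, because its row is the only one that moved far from the auxiliary model. A real attack would likely need a learned or clustering-based decision step over such features in place of a plain minimum or maximum.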
