Adversarial Machine Unlearning
Zonglin Di, Sixie Yu, Yevgeniy Vorobeychik, Yang Liu

TL;DR
This paper introduces a game-theoretic framework for machine unlearning that integrates membership inference attacks to effectively remove specific training data from models, enhancing privacy and unlearning efficiency.
Contribution
It proposes a novel adversarial, game-theoretic approach that incorporates MIAs into unlearning algorithm design using implicit differentiation for improved effectiveness.
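To illustrate how implicit differentiation produces a leader's gradient in a two-player game, here is a minimal sketch on a toy quadratic problem. The matrices `A` and `B`, the vector `target`, and all function names are hypothetical and chosen only for illustration; this is not the paper's objective or implementation. The follower's best response has a closed form here, so the implicit gradient can be checked against finite differences.

```python
import numpy as np

# Hypothetical toy bilevel problem (NOT the paper's objective):
#   follower: phi*(theta) = argmin_phi 0.5 * phi^T A phi - phi^T B theta
#   leader:   F(theta)    = 0.5 * ||phi*(theta) - target||^2
rng = np.random.default_rng(0)
d_phi, d_theta = 4, 3
M = rng.normal(size=(d_phi, d_phi))
A = M @ M.T + d_phi * np.eye(d_phi)  # symmetric positive definite
B = rng.normal(size=(d_phi, d_theta))
target = rng.normal(size=d_phi)


def follower_best_response(theta):
    # Stationarity of the follower's objective: A phi - B theta = 0.
    return np.linalg.solve(A, B @ theta)


def leader_loss(theta):
    phi = follower_best_response(theta)
    return 0.5 * np.sum((phi - target) ** 2)


def hypergradient(theta):
    # Implicit function theorem on g(phi, theta) = A phi - B theta = 0:
    #   dphi/dtheta = -(dg/dphi)^{-1} (dg/dtheta) = A^{-1} B
    # Chain rule: dF/dtheta = (dphi/dtheta)^T (phi - target).
    phi = follower_best_response(theta)
    v = np.linalg.solve(A, phi - target)  # A is symmetric, so A^{-T} = A^{-1}
    return B.T @ v


def finite_difference(theta, eps=1e-6):
    # Central differences, used only to sanity-check the implicit gradient.
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (leader_loss(theta + step) - leader_loss(theta - step)) / (2 * eps)
    return grad


theta0 = rng.normal(size=d_theta)
g_implicit = hypergradient(theta0)
g_numeric = finite_difference(theta0)
print(np.max(np.abs(g_implicit - g_numeric)))
```

The sketch never forms the inverse of `A` explicitly; it solves one linear system per gradient. A dense solve of this kind scales cubically in the dimension of the system being solved, which is one common source of cost in implicit-differentiation methods generally.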
Findings
The framework effectively unlearns specific data from models.
Empirical results show improved privacy preservation.
Because the auditor is itself an MIA, the framework can draw on advances in attack design to improve unlearning.
Abstract
This paper focuses on the challenge of machine unlearning, aiming to remove the influence of specific training data on machine learning models. Traditionally, the development of unlearning algorithms has run parallel with that of membership inference attacks (MIAs), a type of privacy threat that aims to determine whether a data instance was used for training. However, the two strands are intimately connected: one can view machine unlearning through the lens of MIA success with respect to removed data. Recognizing this connection, we propose a game-theoretic framework that integrates MIAs into the design of unlearning algorithms. Specifically, we model the unlearning problem as a Stackelberg game in which an unlearner strives to unlearn specific training data from a model, while an auditor employs MIAs to detect the traces of the ostensibly removed data. Adopting this adversarial perspective allows…
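A Stackelberg game of this kind is commonly written as a bilevel optimization problem. The block below is a generic sketch under assumed notation, not the paper's own: $\theta$ stands for the unlearner's model parameters, $\phi$ for the auditor's attack parameters, and $L_u$, $L_a$ for their respective losses. All four symbols are placeholders introduced here. The second line gives the standard implicit-function-theorem gradient, which assumes the auditor's problem has a unique minimizer with an invertible Hessian.

```latex
\begin{aligned}
\min_{\theta}\quad & L_u\bigl(\theta,\ \phi^*(\theta)\bigr)
\qquad \text{s.t.}\quad \phi^*(\theta) \in \arg\min_{\phi}\ L_a(\phi,\ \theta), \\[4pt]
\frac{d L_u}{d\theta} &= \nabla_\theta L_u
  + \Bigl(\frac{\partial \phi^*}{\partial \theta}\Bigr)^{\!\top} \nabla_\phi L_u,
\qquad
\frac{\partial \phi^*}{\partial \theta}
  = -\bigl(\nabla^2_{\phi\phi} L_a\bigr)^{-1}\, \nabla^2_{\phi\theta} L_a .
\end{aligned}
```

The expression for $\partial \phi^* / \partial \theta$ follows from differentiating the auditor's stationarity condition $\nabla_\phi L_a(\phi^*(\theta), \theta) = 0$ with respect to $\theta$.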
Peer Reviews
Decision: ICLR 2025 Poster
1. The formulation of machine unlearning within a game-theoretic framework is interesting and offers a fresh perspective on the unlearning problem. 2. The paper employs several techniques to enable gradient computation in the Stackelberg game of machine unlearning.
1. The experiments conducted exclusively utilize the ResNet-18 model in image tasks, which may restrict the demonstration of the method's applicability across different architectures. Considering more complex models could provide a broader validation of the method's effectiveness and generalizability. 2. The optimization complexity is high, with a computational complexity of $O\left(n^3\right)$. It would be beneficial for the paper to explore potential techniques to reduce this complexity. 3.
1. The use of a game-theoretic approach to frame the machine unlearning problem is novel and provides a robust theoretical framework to tackle unlearning in an adversarial setting. 2. The paper successfully integrates complex mathematical tools like implicit differentiation and Stackelberg games, which are sophisticated and not commonly applied in standard unlearning approaches.
1. The complexity of the proposed solution, involving advanced mathematical constructs and game-theoretic elements, might pose challenges in terms of practical implementation and computational efficiency. 2. While the method shows effectiveness in controlled experiments, the scalability of this approach in larger, more heterogeneous datasets and in real-world applications is not thoroughly discussed. 3. The effectiveness of the unlearning process is heavily dependent on the assumption that the a
+ This paper is generally well-written and easy to understand. + The proposed method innovatively adopts the idea of adversarial training to solve the machine unlearning problem. Specifically, if MIAs cannot distinguish the unlearned data and the testing data from the unlearned model, then the data can be considered removed from the model. + Extensive experiments against various baselines and datasets validate the effectiveness of the proposed method.
- Efficacy. Although the authors provide some theoretical analysis of the complexity of the proposed method, I wonder how much faster the proposed method is than retraining. Since it involves a lot of optimization, a direct comparison against retraining would be helpful. - Unlearning biased samples. What if the samples to unlearn are not i.i.d. draws from the training data? How does the proposed method perform then? - Sequential unlearning. It seems the proposed method only discusses one-time deletion. Can th
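The criterion one reviewer summarizes above (data counts as removed if an MIA cannot tell the unlearned data from test data) can be illustrated with a simple audit. The sketch below assumes a basic loss-threshold attack and synthetic per-example losses; the function `mia_auc` and all the loss distributions are hypothetical stand-ins, not the auditor or data used in the paper. An AUC near 0.5 means the attack does no better than chance.

```python
import numpy as np


def mia_auc(forget_losses, test_losses):
    """AUC of a loss-threshold attack that guesses 'member' when loss is low.

    Computed as the probability that a random forget-set example scores
    higher (has lower loss) than a random test example, with ties
    counted as one half.
    """
    f = -np.asarray(forget_losses, dtype=float)[:, None]
    t = -np.asarray(test_losses, dtype=float)[None, :]
    return float(np.mean((f > t) + 0.5 * (f == t)))


rng = np.random.default_rng(0)

# Synthetic per-example losses (hypothetical, for illustration only).
test_losses = rng.exponential(scale=1.0, size=500)
# Before unlearning: forget-set losses are lower than test losses, so traces remain.
memorized_forget = rng.exponential(scale=0.2, size=500)
# After an idealized unlearning step: forget-set losses look like test losses.
unlearned_forget = rng.exponential(scale=1.0, size=500)

auc_before = mia_auc(memorized_forget, test_losses)
auc_after = mia_auc(unlearned_forget, test_losses)
print(f"attack AUC before unlearning: {auc_before:.3f}")
print(f"attack AUC after unlearning:  {auc_after:.3f}")
```

In practice the losses would come from evaluating the unlearned model on the forget set and on held-out test data, and a stronger attack could replace the simple threshold rule.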
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
