Towards Irreversible Machine Unlearning for Diffusion Models
Xun Yuan, Zilong Zhao, Jiayu Li, Aryan Pasikhani, Prosanta Gope, Biplab Sikdar

TL;DR
This paper introduces a new attack called DiMRA that can reverse existing machine unlearning methods in diffusion models, and proposes DiMUM as a more robust unlearning approach that memorizes alternative data to prevent regeneration of unlearned content.
Contribution
The paper presents a novel attack on diffusion model unlearning methods and introduces DiMUM, a new unlearning technique that improves robustness by memorizing alternative data.
Findings
DiMRA effectively reverses finetuning-based unlearning methods.
DiMUM outperforms traditional unlearning methods in robustness.
Experimental results demonstrate both the vulnerability of existing unlearning methods to DiMRA and the effectiveness of DiMUM.
Abstract
Diffusion models are renowned for their state-of-the-art performance in generating synthetic images. However, concerns related to safety, privacy, and copyright highlight the need for machine unlearning, which can make diffusion models forget specific training data and prevent the generation of sensitive or unwanted content. Current machine unlearning methods for diffusion models are primarily designed for conditional diffusion models and focus on unlearning specific data classes or features. Among these methods, finetuning-based machine unlearning methods, which update the parameters of pre-trained diffusion models by minimizing carefully designed loss functions, are recognized for their efficiency and effectiveness. However, in this paper, we propose a novel attack named Diffusion Model Relearning Attack (DiMRA), which can reverse the finetuning-based machine unlearning methods, posing…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
