TL;DR
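This paper introduces a conformal prediction-based metric (CR) and an unlearning framework (CPU) for machine unlearning. Together they measure and improve how reliably a model forgets data, targeting "fake forgetting": forgetting data points that are misclassified yet still keep their ground truth label in the conformal prediction set.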
Contribution
It proposes CR, a conformal prediction-based metric, and CPU, an unlearning framework integrating conformal prediction with adversarial attacks, to improve forgetting reliability.
Findings
CR provides a more reliable measure of forgetting quality than unlearning accuracy.
CPU effectively removes the ground truth labels of forgetting data from the conformal prediction set.
Experiments show improved forgetting performance and metric reliability.
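To make the idea of a ground truth label lingering in the conformal prediction set concrete, here is a minimal sketch of standard split conformal prediction for classification, using the nonconformity score 1 - p(label). It is not the paper's CR metric, whose exact definition is not given in this summary. The helper names (`conformal_threshold`, `prediction_set`, `is_fake_forgotten`) and the toy probabilities are hypothetical and chosen for illustration only.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold over calibration scores 1 - p(true label)."""
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank ceil((n + 1) * (1 - alpha)), clipped to n.
    k = min(n, math.ceil((n + 1) * (1 - alpha)))
    return scores[k - 1]

def prediction_set(probs, q_hat):
    """Every label whose score 1 - p(label) does not exceed the threshold."""
    return {c for c, p in enumerate(probs) if 1.0 - p <= q_hat}

def is_fake_forgotten(probs, true_label, q_hat):
    """True if top-1 is wrong but the true label is still in the set."""
    top1 = max(range(len(probs)), key=probs.__getitem__)
    return top1 != true_label and true_label in prediction_set(probs, q_hat)

# Toy calibration data (assumed values): softmax outputs and true labels.
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.3, 0.5], [0.5, 0.4, 0.1]]
cal_labels = [0, 1, 2, 1]
q_hat = conformal_threshold(cal_probs, cal_labels, alpha=0.2)

# A forgetting sample with true label 1: top-1 is class 0, so accuracy-based
# evaluation counts it as forgotten, yet label 1 remains in the set.
weakly_forgotten = [0.5, 0.45, 0.05]
# A sample whose true label has been pushed out of the set entirely.
strongly_forgotten = [0.9, 0.05, 0.05]

print(prediction_set(weakly_forgotten, q_hat))
print(is_fake_forgotten(weakly_forgotten, 1, q_hat))
print(is_fake_forgotten(strongly_forgotten, 1, q_hat))
```

Both toy samples are misclassified, so unlearning accuracy alone cannot tell them apart; only the set-membership check separates them.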
Abstract
Machine unlearning seeks to remove the influence of specified data from a trained model. While unlearning accuracy is a widely used metric for assessing unlearning performance, it falls short in assessing the reliability of forgetting. In this paper, we find that forgetting data points counted as misclassified under unlearning accuracy still have their ground truth labels included in the conformal prediction set when viewed from the uncertainty quantification perspective, a phenomenon we term fake forgetting. To address this issue, we propose a novel metric, CR, inspired by conformal prediction, that offers a more reliable assessment of forgetting quality. Building on these insights, we further propose an unlearning framework, CPU, that incorporates conformal prediction into the Carlini & Wagner adversarial attack loss, enabling the ground truth label to be effectively removed from the…
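For readers unfamiliar with the Carlini & Wagner loss named in the abstract, the sketch below shows the standard untargeted C&W margin term on a single logit vector. How CPU combines this term with conformal prediction is not described in the excerpt above, so this is only the generic building block, not the paper's loss. The function name `cw_margin_loss` and the example logits are hypothetical.

```python
def cw_margin_loss(logits, true_label, kappa=0.0):
    """Untargeted C&W margin: max(Z_true - max_{i != true} Z_i, -kappa).

    Minimizing this pushes the true-class logit below the largest other
    logit; kappa caps how far below is rewarded (the confidence margin).
    """
    other_max = max(z for i, z in enumerate(logits) if i != true_label)
    return max(logits[true_label] - other_max, -kappa)

# True class 0 still has the largest logit: positive loss, not yet forgotten.
still_remembered = cw_margin_loss([2.0, 1.0, 0.5], true_label=0)

# True class 0 is 1.5 below the top logit; with kappa = 1.0 the loss
# saturates at -kappa, so no further gradient pressure is applied.
saturated = cw_margin_loss([0.5, 2.0, 1.0], true_label=0, kappa=1.0)

print(still_remembered, saturated)
```

A larger `kappa` asks for a wider gap between the true label and the winning label, which is the knob one would expect to matter when the goal is to push the true label out of a prediction set rather than merely out of the top-1 position.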
