On the limitation of evaluating machine unlearning using only a single training seed
Jamie Lanyon, Axel Finke, Petros Andreou, Georgina Cosma

TL;DR
This paper highlights the importance of considering training seed variability when evaluating machine unlearning methods, as results can be highly sensitive to seed choice, especially for deterministic algorithms.
Contribution
The paper shows that machine unlearning (MU) evaluations can depend strongly on the random seed used to train the model, and it recommends evaluating across multiple training seeds for more reliable empirical comparisons.
Findings
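Some MU methods can be highly sensitive to the training seed, even for the same architecture and dataset.
Deterministic MU algorithms return the same result every time they start from the same trained model, so repeated runs from a single model cannot reveal this seed dependence.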
Evaluations should incorporate variability across training seeds.
Abstract
Machine unlearning (MU) aims to remove the influence of certain data points from a trained model without costly retraining. Most practical MU algorithms are only approximate and their performance can only be assessed empirically. Care must therefore be taken to make empirical comparisons as representative as possible. A common practice is to run the MU algorithm multiple times independently starting from the same trained model. In this work, we demonstrate that this practice can give highly non-representative results because -- even for the same architecture and same dataset -- some MU methods can be highly sensitive to the choice of random number seed used for model training. We illustrate that this is particularly relevant for MU methods that are deterministic, i.e., which always produce the same result when started from the same trained model. We therefore recommend that empirical…
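The contrast between the two evaluation practices can be illustrated with a toy sketch. This is not the authors' experimental setup: the one-parameter linear model, the gradient-ascent "unlearning" step, and every name here (`make_data`, `train`, `unlearn`, `mse`) are assumptions chosen only to keep the example small. It compares the spread of a forget-set metric when a deterministic unlearning routine is rerun from one trained model against the spread when the model is retrained under several seeds.

```python
import random
import statistics


def make_data(n=40, data_seed=0):
    """Fixed toy dataset: y = 2x + noise (the same for every training seed)."""
    rng = random.Random(data_seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.1)) for x in xs]


def train(data, train_seed, epochs=3, lr=0.1):
    """SGD on a single weight; the seed controls initialisation and shuffling."""
    rng = random.Random(train_seed)
    w = rng.gauss(0.0, 1.0)
    order = list(data)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            w -= lr * 2.0 * (w * x - y) * x
    return w


def unlearn(w, forget, steps=5, lr=0.05):
    """Hypothetical deterministic unlearning: gradient ascent on the forget set."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in forget) / len(forget)
        w += lr * grad
    return w


def mse(w, points):
    return sum((w * x - y) ** 2 for x, y in points) / len(points)


data = make_data()
forget = data[:5]

# Practice A: one trained model, unlearning repeated five times.
w0 = train(data, train_seed=0)
same_model = [mse(unlearn(w0, forget), forget) for _ in range(5)]

# Practice B: a fresh model per training seed, unlearning run once on each.
across_seeds = [
    mse(unlearn(train(data, train_seed=s), forget), forget) for s in range(5)
]

spread_same = statistics.pstdev(same_model)
spread_seeds = statistics.pstdev(across_seeds)
print("spread, same trained model:", spread_same)
print("spread, across training seeds:", spread_seeds)
```

Because `unlearn` uses no randomness, the five repeats in practice A are identical and their spread is zero, whatever the true variability of the method. Only practice B exposes how much the metric moves with the training seed. How large that movement is in this toy says nothing about real MU methods.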
