Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion
Sonia Laguna, Jorge da Silva Goncalves, Moritz Vandenhirtz, Alain Ryser, Irene Cannistraci, Julia E. Vogt

TL;DR
This paper introduces a model design that supports forgetting specific training data by deleting keys, enabling fast unlearning without retraining or access to the training data. The authors report that it outperforms existing post-hoc methods.
Contribution
The paper proposes unlearning by design with MUNKEY, a memory-augmented transformer that allows direct, zero-shot forgetting through key deletion, unlike traditional post-hoc approaches.
Findings
MUNKEY enables zero-shot forgetting without retraining.
It outperforms post-hoc baselines across multiple datasets.
It supports deployment-oriented unlearning while preserving accuracy.
Abstract
Machine unlearning is rapidly becoming a practical requirement, driven by privacy regulations, data errors, and the need to remove harmful or corrupted training samples. Despite this, most existing methods tackle the problem purely from a post-hoc perspective. They attempt to erase the influence of targeted training samples through parameter updates that typically require access to the full training data. This creates a mismatch with real deployment scenarios where unlearning requests can be anticipated, revealing a fundamental limitation of post-hoc approaches. We propose unlearning by design, a novel paradigm in which models are directly trained to support forgetting as an inherent capability. We instantiate this idea with Machine UNlearning via KEY deletion (MUNKEY), a memory-augmented transformer that decouples instance-specific memorization from model weights. Here, unlearning…
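To make the idea of forgetting by key deletion concrete, here is a minimal toy sketch in Python. It is not the MUNKEY architecture, whose details are not given in this summary. It only assumes that instance-specific information lives in an external key-value memory read by attention, so that removing one instance's slot removes its contribution without touching any weights. The class `KeyValueMemory`, its methods, and the sample identifiers are hypothetical names chosen for illustration.

```python
import math


class KeyValueMemory:
    """Toy external memory with one (key, value) slot per training instance.

    Hypothetical illustration only; not the memory module used in the paper.
    """

    def __init__(self):
        self.slots = {}  # instance_id -> (key vector, value vector)

    def write(self, instance_id, key, value):
        self.slots[instance_id] = (list(key), list(value))

    def delete(self, instance_id):
        # "Unlearning" here is just dropping the slot; no parameters change.
        return self.slots.pop(instance_id, None) is not None

    def read(self, query, temperature=1.0):
        """Softmax-attention readout over the keys still in memory."""
        if not self.slots:
            return None
        entries = list(self.slots.values())
        scores = [
            sum(q * k for q, k in zip(query, key)) / temperature
            for key, _ in entries
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dim = len(entries[0][1])
        return [
            sum(w * value[i] for w, (_, value) in zip(weights, entries)) / total
            for i in range(dim)
        ]


memory = KeyValueMemory()
memory.write("sample_a", key=[1.0, 0.0], value=[1.0, 0.0])
memory.write("sample_b", key=[0.0, 1.0], value=[0.0, 1.0])

query = [1.0, 0.0]  # matches the key stored for sample_a
before = memory.read(query, temperature=0.1)
memory.delete("sample_a")  # forget sample_a by deleting its key
after = memory.read(query, temperature=0.1)
```

In this toy setup, `before` is dominated by the value stored for `sample_a`, while `after` is computed only from the remaining slot. The deletion is a dictionary removal, which is why no retraining and no access to the original training data are needed in the sketch.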
