Easy to Learn, Yet Hard to Forget: Towards Robust Unlearning Under Bias
JuneHyoung Kwon, MiHyeon Kim, Eunju Lee, Yoonji Lee, Seunghoon Lee, YoungBin Kim

TL;DR
This paper studies machine unlearning in biased models, identifies a phenomenon called shortcut unlearning, and proposes CUPID, a framework intended to improve unlearning effectiveness and reduce bias retention.
Contribution
The paper introduces the concept of shortcut unlearning, analyzes its causes, and proposes CUPID, an unlearning method that disentangles causal and bias pathways for better forgetting.
Findings
CUPID achieves state-of-the-art forgetting performance.
It effectively mitigates shortcut unlearning in biased models.
The method improves model reliability and privacy protection.
Abstract
Machine unlearning, which enables a model to forget specific data, is crucial for ensuring data privacy and model reliability. However, its effectiveness can be severely undermined in real-world scenarios where models learn unintended biases from spurious correlations within the data. This paper investigates the unique challenges of unlearning from such biased models. We identify a novel phenomenon we term "shortcut unlearning," where models exhibit an "easy to learn, yet hard to forget" tendency. Specifically, models struggle to forget easily-learned, bias-aligned samples; instead of forgetting the class attribute, they unlearn the bias attribute, which can paradoxically improve accuracy on the class intended to be forgotten. To address this, we propose CUPID, a new unlearning framework inspired by the observation that samples with different biases exhibit distinct loss landscape…
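
The shortcut-unlearning effect described in the abstract can be checked with a simple diagnostic: measure accuracy on the forget class separately for bias-aligned and bias-conflicting samples, before and after unlearning. The sketch below is illustrative only and is not the paper's evaluation code or the CUPID method. The function name `forget_class_accuracy`, the `bias_aligned` flags, and every prediction in the toy data are hypothetical values invented for the example, not results from the paper.

```python
from typing import Dict, Sequence


def forget_class_accuracy(
    preds: Sequence[int],
    labels: Sequence[int],
    bias_aligned: Sequence[bool],
    forget_class: int,
) -> Dict[str, float]:
    """Accuracy on forget-class samples, split by bias alignment.

    `bias_aligned[i]` is an assumed per-sample flag: True when the
    sample's bias attribute agrees with the spurious correlation.
    """
    hits = {"aligned": 0, "conflicting": 0}
    counts = {"aligned": 0, "conflicting": 0}
    for pred, label, aligned in zip(preds, labels, bias_aligned):
        if label != forget_class:
            continue  # only the class targeted for forgetting matters here
        group = "aligned" if aligned else "conflicting"
        counts[group] += 1
        hits[group] += int(pred == label)

    result = {
        g: hits[g] / counts[g] if counts[g] else float("nan") for g in hits
    }
    total = sum(counts.values())
    result["overall"] = sum(hits.values()) / total if total else float("nan")
    return result


# Toy data (invented): class 0 is the forget class; the last two samples
# belong to class 1 and are ignored by the diagnostic.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
bias_aligned = [True] * 4 + [False] * 4 + [True, False]

# A biased model before unlearning: right on aligned, wrong on conflicting.
preds_before = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
# After a hypothetical unlearning run that drops the bias cue instead of
# the class: aligned samples are mostly still recognized, conflicting
# samples are now classified correctly.
preds_after = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1]

before = forget_class_accuracy(preds_before, labels, bias_aligned, forget_class=0)
after = forget_class_accuracy(preds_after, labels, bias_aligned, forget_class=0)
print("before:", before)
print("after: ", after)
```

In this toy setup, overall forget-class accuracy rises after "unlearning" because the gain on bias-conflicting samples outweighs the small drop on bias-aligned ones. That is the pattern the abstract calls paradoxical. Successful forgetting would instead show low accuracy in both groups.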
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning
