Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation

Weidong Zheng; Kongyang Chen; Yuanwei Guo; Yatie Xiao

arXiv:2605.08730·cs.LG·May 12, 2026

TL;DR

This paper investigates classification-head bias in class-level machine unlearning: it diagnoses the cause, proposes mitigation techniques, and introduces new evaluation metrics, with experiments on standard datasets and a public code release.

Contribution

The paper reveals a bias-dominated shortcut in class-level unlearning, analyzes the gradient dynamics behind it, and proposes bias-aware mechanisms and metrics for improved unlearning.

Findings

1. Bias manipulation alone can satisfy conventional unlearning metrics, but it leaves abnormal bias patterns in the classification head.

2. The proposed methods maintain unlearning performance while stabilizing bias distributions.

3. Code is publicly available in the GitHub repository listed under Code & Models.
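
The first finding can be illustrated with a toy example. The sketch below is not the paper's BiasShift implementation; it is a minimal illustration that assumes a hypothetical three-class linear head with identity weights and hand-picked feature vectors. It lowers only the bias of the forgotten class and then recomputes forget-set and retain-set accuracy.

```python
# Toy illustration (hypothetical data, not the authors' code): shifting one
# bias term in a linear classification head, leaving all weights untouched.

def predict(weights, bias, x):
    """Return the argmax class of the linear head: logits = W x + b."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return max(range(len(logits)), key=logits.__getitem__)

def accuracy(weights, bias, samples):
    """Fraction of (features, label) pairs classified correctly."""
    return sum(predict(weights, bias, x) == y for x, y in samples) / len(samples)

# Assumed setup: identity weights, zero biases, class 0 is the forget class.
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
bias = [0.0, 0.0, 0.0]
FORGET_CLASS = 0

forget_set = [([2.0, 0.5, 0.1], 0), ([1.5, 0.2, 0.4], 0)]
retain_set = [([0.3, 2.0, 0.1], 1), ([0.1, 0.2, 2.0], 2), ([0.4, 1.8, 0.2], 1)]

forget_before = accuracy(weights, bias, forget_set)
retain_before = accuracy(weights, bias, retain_set)

# The "shortcut": push the forget-class bias far below the others.
shifted_bias = list(bias)
shifted_bias[FORGET_CLASS] -= 10.0

forget_after = accuracy(weights, shifted_bias, forget_set)
retain_after = accuracy(weights, shifted_bias, retain_set)

print("forget accuracy:", forget_before, "->", forget_after)
print("retain accuracy:", retain_before, "->", retain_after)
print("biases after shift:", shifted_bias)
```

In this toy setup the forget-set accuracy drops to zero while the retain-set accuracy is unchanged, yet the learned weights are identical and the only trace of "forgetting" is one outlying bias value. That is the kind of internal pattern that accuracy-based metrics cannot see.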

Abstract

Class-level machine unlearning aims to remove the influence of specified classes while preserving model utility on retained classes. Existing methods are commonly evaluated by retain-set accuracy, forget-set accuracy, and unlearning time, but these metrics provide limited insight into how forgetting is achieved internally. In this paper, we reveal a bias-dominated shortcut in class-level unlearning: the prediction of forgotten classes can be suppressed by decreasing the corresponding bias terms in the final classification head. We first analyze the gradient dynamics of classification-head biases under softmax cross-entropy training, explaining why retain-set-only optimization tends to reduce the biases of absent classes. Based on this observation, we introduce BiasShift as a diagnostic baseline, showing that simple bias manipulation can satisfy conventional unlearning metrics while…
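
The gradient argument in the abstract can be checked numerically. For softmax cross-entropy, the standard gradient of the loss with respect to the bias of class k is p_k - y_k, where p_k is the predicted probability and y_k the one-hot target. On a batch that contains no samples of a given class, y_k is always 0, so the gradient for that class is strictly positive and gradient descent lowers its bias. The sketch below assumes a hypothetical five-class head and random logits; it is an illustration of this textbook identity, not a reproduction of the paper's analysis.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_bias_gradient(batch_logits, batch_labels):
    """Batch-mean gradient of softmax cross-entropy w.r.t. the head biases.

    For each sample the gradient for class k is p_k - y_k.
    """
    num_classes = len(batch_logits[0])
    grad = [0.0] * num_classes
    for logits, label in zip(batch_logits, batch_labels):
        probs = softmax(logits)
        for k in range(num_classes):
            grad[k] += probs[k] - (1.0 if k == label else 0.0)
    return [g / len(batch_logits) for g in grad]

# Assumed setup: 5 classes, class 2 is absent from the retain-only batch.
random.seed(0)
NUM_CLASSES, FORGET_CLASS = 5, 2
retain_classes = [c for c in range(NUM_CLASSES) if c != FORGET_CLASS]
batch_labels = [random.choice(retain_classes) for _ in range(256)]
batch_logits = [[random.gauss(0.0, 1.0) for _ in range(NUM_CLASSES)]
                for _ in range(256)]

grad = mean_bias_gradient(batch_logits, batch_labels)

# A gradient-descent step is b <- b - lr * grad, so a positive entry means
# that bias decreases. The absent class always has a positive entry.
print("bias gradient per class:", [round(g, 4) for g in grad])
print("absent-class gradient is positive:", grad[FORGET_CLASS] > 0)
```

Because the per-sample gradients sum to zero across classes, the downward pressure on the absent class's bias is matched by upward pressure spread over the retained classes, which is consistent with the abstract's claim that retain-set-only optimization tends to reduce the biases of absent classes.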

Code & Models

Repositories

zwd2024/Beyond-the-Shadow-of-Bias-From-Classification-Head-Bias-to-Parameter-Redistribution (GitHub)
