Decoupling the Class Label and the Target Concept in Machine Unlearning

Jianing Zhu; Bo Han; Jiangchao Yao; Jianliang Xu; Gang Niu; Masashi Sugiyama

arXiv:2406.08288·cs.LG·June 18, 2024

Open Access · 3 Reviews

TL;DR

This paper introduces TARF, a novel framework for machine unlearning that decouples class labels from target concepts, enabling more precise forgetting of specific concepts beyond traditional class-wise methods.

Contribution

It proposes a new decoupling approach and a general framework, TARF, to improve targeted concept forgetting in machine unlearning tasks.
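To make the decoupling concrete, here is a toy illustration of how a forgetting set selected by class label can differ from the set of samples that actually carry the target concept. It is only a sketch under stated assumptions: the two-level taxonomy, the sample list, and the helper names (`FINE_TO_COARSE`, `forget_set_by_class`, `concept_set`) are hypothetical and are not taken from the paper or from TARF.

```python
# Hypothetical two-level taxonomy: fine label -> coarse label (assumed for illustration).
FINE_TO_COARSE = {"car": "vehicle", "truck": "vehicle", "cat": "animal", "dog": "animal"}

# Toy dataset: each sample is represented only by its fine-grained label.
samples = ["car", "car", "truck", "cat", "dog", "dog"]


def forget_set_by_class(samples, label_of, target_class):
    """Indices a class-wise method would forget: samples whose training label is target_class."""
    return {i for i, s in enumerate(samples) if label_of(s) == target_class}


def concept_set(samples, belongs_to_concept):
    """Indices of samples that actually carry the target concept."""
    return {i for i, s in enumerate(samples) if belongs_to_concept(s)}


# Case A: the model was trained on coarse labels, but the target concept is the finer "car".
class_wise_a = forget_set_by_class(samples, lambda s: FINE_TO_COARSE[s], "vehicle")
concept_a = concept_set(samples, lambda s: s == "car")
over_forgotten = class_wise_a - concept_a    # forgotten by label, outside the concept

# Case B: the model was trained on fine labels, but the target concept is the coarser "animal".
class_wise_b = forget_set_by_class(samples, lambda s: s, "dog")
concept_b = concept_set(samples, lambda s: FINE_TO_COARSE[s] == "animal")
under_forgotten = concept_b - class_wise_b   # inside the concept, missed by the label

print("over-forgotten indices:", sorted(over_forgotten))
print("under-forgotten indices:", sorted(under_forgotten))
```

In both cases the class label and the target concept pick out different samples, which is the kind of gap between the forgetting target and the training taxonomy that the summary describes.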

Findings

01. TARF effectively forgets target concepts while preserving other knowledge.

02. Decoupling class labels from target concepts addresses new unlearning challenges.

03. Empirical results demonstrate TARF's superiority over existing methods.

Abstract

Machine unlearning, an emerging research topic driven by data regulations, aims to adjust a trained model to approximate a retrained one that excludes a portion of the training data. Previous studies showed that class-wise unlearning can successfully forget the knowledge of a target class, through gradient ascent on the forgetting data or fine-tuning with the remaining data. However, while these methods are useful, they are insufficient because the class label and the target concept are often assumed to coincide. In this work, we decouple the two by considering label domain mismatch and investigate three problems beyond conventional all-matched forgetting: target mismatch, model mismatch, and data mismatch forgetting. We systematically analyze the new challenges in restrictively forgetting the target concept and also reveal crucial forgetting dynamics at the representation level to…
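The two conventional class-wise baselines named in the abstract, gradient ascent on the forgetting data and fine-tuning with the remaining data, can be sketched on a toy problem. This is a minimal sketch under stated assumptions, not the paper's method or TARF: the linear softmax classifier, the synthetic three-cluster data, the learning rates, and the step counts are all chosen here purely for illustration.

```python
import math
import random


def log_softmax(z):
    # Numerically stable log-softmax over a list of logits.
    m = max(z)
    lse = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - lse for v in z]


def logits(W, x):
    # W[k] is the weight vector of class k; x carries a trailing 1.0 bias feature.
    return [sum(w * xi for w, xi in zip(Wk, x)) for Wk in W]


def loss(W, data):
    # Mean cross-entropy of the linear softmax classifier on data.
    return -sum(log_softmax(logits(W, x))[y] for x, y in data) / len(data)


def grad(W, data):
    # Gradient of the mean cross-entropy with respect to W.
    G = [[0.0] * len(W[0]) for _ in W]
    for x, y in data:
        p = [math.exp(v) for v in log_softmax(logits(W, x))]
        for k in range(len(W)):
            err = p[k] - (1.0 if k == y else 0.0)
            for j, xj in enumerate(x):
                G[k][j] += err * xj / len(data)
    return G


def step(W, data, lr):
    # One gradient step: lr > 0 descends the loss, lr < 0 ascends it.
    G = grad(W, data)
    return [[w - lr * g for w, g in zip(Wk, Gk)] for Wk, Gk in zip(W, G)]


def accuracy(W, data):
    hits = 0
    for x, y in data:
        z = logits(W, x)
        hits += int(z.index(max(z)) == y)
    return hits / len(data)


def gradient_ascent_unlearn(W, forget, lr=0.1, steps=400):
    # Baseline 1: maximize the loss on the forgetting data.
    for _ in range(steps):
        W = step(W, forget, -lr)
    return W


def finetune_on_remaining(W, remain, lr=0.05, steps=100):
    # Baseline 2: keep minimizing the loss on the remaining data only.
    for _ in range(steps):
        W = step(W, remain, lr)
    return W


# Synthetic data (assumed): three Gaussian clusters, one per class.
random.seed(0)
centers = [(4.0, 0.0), (-4.0, 0.0), (0.0, 4.0)]
data = [([cx + random.gauss(0, 1), cy + random.gauss(0, 1), 1.0], k)
        for k, (cx, cy) in enumerate(centers) for _ in range(50)]

# Train the original model on all classes.
W = [[0.0, 0.0, 0.0] for _ in centers]
for _ in range(200):
    W = step(W, data, 0.1)

# Class-wise split: class 0 is the forgetting target.
forget = [(x, y) for x, y in data if y == 0]
remain = [(x, y) for x, y in data if y != 0]

loss_forget_before = loss(W, forget)
acc_forget_before = accuracy(W, forget)
loss_remain_before = loss(W, remain)

W_ascent = gradient_ascent_unlearn(W, forget)
W_finetune = finetune_on_remaining(W, remain)

loss_forget_after = loss(W_ascent, forget)
acc_forget_after = accuracy(W_ascent, forget)
loss_remain_after = loss(W_finetune, remain)

print("forget accuracy:", acc_forget_before, "->", acc_forget_after)
print("remaining loss after fine-tuning:", loss_remain_before, "->", loss_remain_after)
```

Both baselines select data purely by class label, which is the assumption the abstract argues breaks down once the label and the target concept are decoupled. The printed numbers depend on the toy data and are not results from the paper.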

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 10 · Confidence 2

Strengths

1. The paper presents a novel method that decouples the class label from the target concept, addressing practical scenarios where the forgetting target may not align with the pre-training taxonomy. 2. The authors provide theoretical insights by introducing the notion of representation gravity. The solid analysis of unlearning dynamics at the representation level is very important. 3. Extensive experiments show results consistent with the paper's claims.

Weaknesses

TARF introduces multiple dynamic hyperparameters (e.g., k(t), τ(x,y,t), β, t₀, t₁), but their tuning procedure is only heuristically described. Ablation results help, but further discussion on computational stability and robustness would strengthen the paper.

Reviewer 02 · Rating 8 · Confidence 3

Strengths

1. Pioneering Problem Formulation: The paper introduces a novel, highly practical, and significant problem by decoupling the class label from the target concept in machine unlearning. This new perspective, formalized into three distinct mismatch scenarios, fundamentally expands the scope of the field and bridges a critical gap between academic research and real-world requirements for privacy, copyright, and AI safety. 2. Principled and General Framework (TARF): The proposed TARF framework is an el

Weaknesses

1. While the appendix contains a detailed ablation study for key hyperparameters (Figure 17), the discussion is currently presented as an empirical result. To improve practical adoption, the paper would be strengthened by synthesizing these findings into explicit guidelines or heuristics for practitioners. For example, the authors could provide a recommended strategy for setting k based on dataset characteristics or model capacity. 2. The core mechanism of TARF cleverly leverages "representation g

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The manuscript introduces new settings that decouple the class label and the target concept, which investigate the label domain mismatch in class-wise unlearning. 2. The manuscript systematically reveals the challenges of restrictive unlearning with the mismatched label domains, and demonstrates that the "representation gravity" in forgetting dynamics is critical for achieving the forgetting target in the new tasks.

Weaknesses

1. Some figure legends in the manuscript obscure content, for example in Fig 5 and Fig 7. Certain legends are also not fully explained, such as the line legend in Fig 5(b), which affects readability. The tables are hard to read; highlighting the second-best values in Table 2 is recommended. 2. Compared with some simpler forgetting methods, the Target Identification stage in TARF requires additional computation to observe the “gravity ef

Taxonomy

Topics: Educational Technology and Assessment · Engineering Education and Curriculum Development · Online Learning and Analytics