On the Necessity of Output Distribution Reweighting for Effective Class Unlearning

Ali Ebrahimpour-Boroojeny, Yian Wang, and Hari Sundaram

arXiv:2506.20893·cs.LG·November 17, 2025

TL;DR

This paper shows that overlooking class geometry in class unlearning can cause privacy leakage, and proposes a reweighting method, TRW, that improves privacy and unlearning effectiveness across multiple datasets.

Contribution

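The paper introduces Tilted ReWeighting (TRW), a fine-tuning objective that mitigates privacy leakage by approximating, for forget-class inputs, the distribution over the remaining classes that a retrained-from-scratch model would produce. The approximation is built from estimates of inter-class similarity.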
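To make the idea of a tilted target distribution concrete, here is a minimal sketch in plain Python. It is not the paper's exact formulation: the exponential tilt, the `similarity` scores, and the `beta` strength parameter are all assumptions introduced for illustration. The sketch zeroes out the forget class, reweights the model's remaining probabilities by a similarity-dependent factor, and renormalizes.

```python
import math


def tilted_reweight(probs, forget_class, similarity, beta=1.0):
    """Build a target distribution over the remaining classes.

    probs: the model's softmax output for one forget-class input.
    similarity: hypothetical similarity score of each class to the forget class.
    beta: hypothetical tilt strength; beta=0 reduces to plain renormalization.
    """
    weights = []
    for c, p in enumerate(probs):
        if c == forget_class:
            # The forget class receives no mass in the target distribution.
            weights.append(0.0)
        else:
            # Assumed exponential tilt: similar classes gain relative mass.
            weights.append(p * math.exp(beta * similarity[c]))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no probability mass on the remaining classes")
    return [w / total for w in weights]


probs = [0.7, 0.2, 0.1]   # class 0 is the forget class
sim = [1.0, 0.9, 0.1]     # class 1 is assumed to be the closest neighbor
plain = tilted_reweight(probs, 0, sim, beta=0.0)
tilted = tilted_reweight(probs, 0, sim, beta=2.0)
```

With `beta=0.0` the result is the model's own distribution renormalized over the remaining classes. A positive `beta` shifts additional mass toward the class assumed to be most similar to the forget class.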

Findings

1. TRW reduces privacy leakage in class unlearning.

2. TRW outperforms existing methods on unlearning metrics.

3. TRW achieves significant improvements on CIFAR-10 benchmarks.

Abstract

In this paper, we reveal a significant shortcoming in class unlearning evaluations: overlooking the underlying class geometry can cause privacy leakage. We further propose a simple yet effective solution to mitigate this issue. We introduce a membership-inference attack via nearest neighbors (MIA-NN) that uses the probabilities the model assigns to neighboring classes to detect unlearned samples. Our experiments show that existing unlearning methods are vulnerable to MIA-NN across multiple datasets. We then propose a new fine-tuning objective that mitigates this privacy leakage by approximating, for forget-class inputs, the distribution over the remaining classes that a retrained-from-scratch model would produce. To construct this approximation, we estimate inter-class similarity and tilt the target model's distribution accordingly. The resulting Tilted ReWeighting (TRW) distribution…
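The nearest-neighbor signal that MIA-NN relies on can be illustrated with a short sketch. This is a simplified reading of the abstract, not the paper's attack: the similarity scores, the choice of `k`, the fixed `threshold`, and the direction of the decision rule (low neighbor mass is treated as suspicious) are all assumptions made for illustration.

```python
def nearest_classes(similarity_to_forget, forget_class, k=1):
    """Return the k remaining classes most similar to the forget class."""
    candidates = [c for c in range(len(similarity_to_forget)) if c != forget_class]
    candidates.sort(key=lambda c: similarity_to_forget[c], reverse=True)
    return candidates[:k]


def neighbor_mass(probs, neighbors):
    """Total probability the model assigns to the neighboring classes."""
    return sum(probs[c] for c in neighbors)


def flag_unlearned(probs, neighbors, threshold):
    """Assumed decision rule: flag a sample when neighbor mass is atypically low."""
    return neighbor_mass(probs, neighbors) < threshold


sim = [1.0, 0.9, 0.1, 0.2]             # hypothetical similarities to class 0
neighbors = nearest_classes(sim, forget_class=0, k=1)

spread_out = [0.01, 0.33, 0.33, 0.33]  # mass spread evenly over remaining classes
concentrated = [0.01, 0.90, 0.05, 0.04]  # mass concentrated on the nearest class

flag_a = flag_unlearned(spread_out, neighbors, threshold=0.5)
flag_b = flag_unlearned(concentrated, neighbors, threshold=0.5)
```

In this toy setup, the output that ignores class geometry is flagged, while the output that concentrates on the nearest class is not. A realistic attack would calibrate the threshold on reference data rather than fixing it by hand.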
