PrivUn: Unveiling Latent Ripple Effects and Shallow Forgetting in Privacy Unlearning

Xiaoyi Chen; Haoyuan Wang; Siyuan Tang; Sijia Liu; Liya Su; XiaoFeng Wang; Haixu Tang

arXiv:2604.22076·cs.LG·April 27, 2026

TL;DR

PrivUn is an evaluation framework for privacy unlearning in large language models. It tests unlearned models against direct retrieval, in-context learning recovery, and fine-tuning restoration attacks, reveals weaknesses such as shallow forgetting and ripple effects, and proposes strategies for deeper, more effective unlearning.

Contribution

The paper systematically assesses unlearning robustness, uncovers gradient-driven ripple effects and shallow forgetting, and proposes new strategies for deep privacy unlearning.

Findings

01. Unlearning propagates through gradient-based associations, not just semantic relations.

02. Most methods fail to remove private information across multiple deep layers.

03. Proposed strategies improve deep forgetting by leveraging gradient similarity and representational constraints.
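
The first finding can be illustrated with a toy sketch. This summary does not give the paper's association metric, so the example below assumes cosine similarity between per-example loss gradients as a stand-in, and uses a two-weight logistic model rather than an LLM. The names `grad_logistic`, `cosine`, `loss`, and `ripple` are hypothetical helpers, not part of PrivUn. The sketch takes one gradient-ascent step on a "forget" example and reports, for each other example, its gradient similarity to the forget example and how much its loss moved.

```python
import math

def _sigmoid_dot(w, x):
    """Sigmoid of the dot product w . x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, y):
    """Logistic (cross-entropy) loss for one example, y in {0, 1}."""
    p = _sigmoid_dot(w, x)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def grad_logistic(w, x, y):
    """Gradient of the logistic loss with respect to w for one example."""
    p = _sigmoid_dot(w, x)
    return [(p - y) * xi for xi in x]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def ripple(w, forget, others, lr=0.1):
    """Take one gradient-ascent step on the forget example (an assumed,
    simplified unlearning update) and return, per other example,
    (gradient cosine similarity, change in loss)."""
    g_f = grad_logistic(w, *forget)
    w_new = [wi + lr * gi for wi, gi in zip(w, g_f)]
    report = []
    for x, y in others:
        sim = cosine(g_f, grad_logistic(w, x, y))
        delta = loss(w_new, x, y) - loss(w, x, y)
        report.append((sim, delta))
    return report

# Illustrative data (made up): one aligned, one orthogonal, one opposed example.
w0 = [0.5, -0.5]
forget_example = ([1.0, 0.0], 1)
other_examples = [([1.0, 0.1], 1), ([0.0, 1.0], 1), ([-1.0, 0.0], 1)]

for sim, delta in ripple(w0, forget_example, other_examples):
    print(f"gradient cosine = {sim:+.3f}, loss change = {delta:+.5f}")
```

To first order, a step of size `lr` along the forget gradient changes another example's loss by `lr` times the dot product of the two gradients, so examples with aligned gradients are disturbed most even when they share no obvious semantic link with the forgotten one. Whether PrivUn measures association this way is an assumption of this sketch.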

Abstract

Large language models (LLMs) often memorize private information during training, raising serious privacy concerns. While machine unlearning has emerged as a promising solution, its true effectiveness against privacy attacks remains unclear. To address this, we propose PrivUn, a new evaluation framework that systematically assesses unlearning robustness through three-tier attack scenarios: direct retrieval, in-context learning recovery, and fine-tuning restoration; combined with quantitative analysis using forgetting scores, association metrics, and forgetting depth assessment. Our study exposes significant weaknesses in current unlearning methods, revealing two key findings: 1) unlearning exhibits gradient-driven ripple effects: unlike traditional forgetting which follows semantic relations (e.g., knowledge graphs), privacy unlearning propagates across latent gradient-based…
