PrivUn: Unveiling Latent Ripple Effects and Shallow Forgetting in Privacy Unlearning
Xiaoyi Chen, Haoyuan Wang, Siyuan Tang, Sijia Liu, Liya Su, XiaoFeng Wang, Haixu Tang

TL;DR
PrivUn introduces a comprehensive framework to evaluate privacy unlearning in large language models, revealing weaknesses like shallow forgetting and ripple effects, and proposing strategies for deeper, more effective unlearning.
Contribution
The paper systematically assesses unlearning robustness, uncovers gradient-driven ripple effects and shallow forgetting, and proposes new strategies for deep privacy unlearning.
Findings
Unlearning propagates through gradient-based associations, not just semantic relations.
Most methods fail to remove private information across multiple deep layers.
Proposed strategies improve deep forgetting by leveraging gradient similarity and representational constraints.
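To make the idea of a gradient-based association concrete, the sketch below compares the loss gradients of two training examples by cosine similarity. It is an illustrative toy only, not PrivUn's association metric: the logistic-regression model and the names `logistic_loss_gradient`, `cosine_similarity`, and `gradient_association` are assumptions introduced here, and the example data is made up. The intuition it shows is that an update which ascends the loss on a forget example also perturbs any example whose gradient points in a similar direction, whether or not the two are semantically related.

```python
import math


def logistic_loss_gradient(weights, features, label):
    """Gradient of the logistic loss for one example w.r.t. the weights."""
    logit = sum(w * x for w, x in zip(weights, features))
    prediction = 1.0 / (1.0 + math.exp(-logit))
    return [(prediction - label) * x for x in features]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def gradient_association(weights, example_a, example_b):
    """Hypothetical association score: cosine similarity of per-example gradients."""
    grad_a = logistic_loss_gradient(weights, *example_a)
    grad_b = logistic_loss_gradient(weights, *example_b)
    return cosine_similarity(grad_a, grad_b)


# Toy model and data (all values are made up for illustration).
weights = [0.5, -0.25, 0.1]
forget_example = ([1.0, 2.0, 0.0], 1.0)      # the example to be unlearned
aligned_example = ([2.0, 4.0, 0.0], 1.0)     # gradient points the same way
unrelated_example = ([0.0, 0.0, 3.0], 1.0)   # gradient is orthogonal

aligned_score = gradient_association(weights, forget_example, aligned_example)
unrelated_score = gradient_association(weights, forget_example, unrelated_example)
print(f"aligned: {aligned_score:.3f}, unrelated: {unrelated_score:.3f}")
```

In this toy setting the aligned example scores close to 1 and the unrelated one close to 0, so only the former would be expected to feel a ripple from unlearning the forget example. For an LLM the same comparison would be made over gradients of model parameters, which is considerably more expensive.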
Abstract
Large language models (LLMs) often memorize private information during training, raising serious privacy concerns. While machine unlearning has emerged as a promising solution, its true effectiveness against privacy attacks remains unclear. To address this, we propose PrivUn, a new evaluation framework that systematically assesses unlearning robustness through three-tier attack scenarios (direct retrieval, in-context learning recovery, and fine-tuning restoration), combined with quantitative analysis using forgetting scores, association metrics, and forgetting depth assessment. Our study exposes significant weaknesses in current unlearning methods, revealing two key findings: 1) unlearning exhibits gradient-driven ripple effects: unlike traditional forgetting, which follows semantic relations (e.g., knowledge graphs), privacy unlearning propagates across latent gradient-based…
