DurableUn: Quantization-Induced Recovery Attacks in Machine Unlearning
Abdullah Ahmad Khan, Ferdous Sohel

TL;DR
This paper demonstrates that INT4 quantization can systematically restore forgotten content in machine unlearning, revealing vulnerabilities and proposing a quantization-aware method for improved robustness.
Contribution
The paper introduces the first systematic study of unlearning robustness under INT4 quantization and proposes DURABLEUN-SAF, a new quantization-aware unlearning method.
Findings
INT4 quantization induces up to 22x recovery of forgotten content.
No existing method achieves strong forgetting, utility, and quantization robustness simultaneously.
DURABLEUN-SAF achieves a stable durability certificate under INT4 quantization.
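One way to build intuition for why 4-bit deployment might undo unlearning while 8-bit does not (the abstract reports INT8 as benign) is to look at rounding granularity. The sketch below is an illustrative assumption, not the paper's analysis: it uses plain symmetric round-to-nearest quantization instead of NF4, and the weights, the size of the unlearning update, and the helper names `quantize_codes` and `surviving_updates` are all hypothetical. It counts how many small weight changes remain visible after quantization.

```python
def quantize_codes(weights, bits):
    """Map weights to integer codes with symmetric round-to-nearest.

    Simplified stand-in for real schemes; this is NOT NF4.
    """
    levels = 2 ** (bits - 1) - 1              # 7 for 4-bit, 127 for 8-bit
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) for w in weights]


def surviving_updates(original, unlearned, bits):
    """Count weights whose quantized code differs after the update."""
    before = quantize_codes(original, bits)
    after = quantize_codes(unlearned, bits)
    return sum(1 for b, a in zip(before, after) if b != a)


# Hypothetical toy values: an "original" model and an "unlearned" copy
# that differs only by a small update on three of the four weights.
original = [0.90, -0.40, 0.25, -1.00]
update = [0.02, -0.02, 0.02, 0.00]
unlearned = [w + d for w, d in zip(original, update)]

for bits in (8, 4):
    kept = surviving_updates(original, unlearned, bits)
    print(f"{bits}-bit: {kept} of 3 updated weights still differ")
```

With these made-up values, the 8-bit grid keeps all three updated weights distinguishable from the original, while the 4-bit grid maps every weight back to its original code, so the quantized "unlearned" model is indistinguishable from the quantized original. Whether this mechanism accounts for the recovery reported here is not established by the summary above.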
Abstract
Machine unlearning aims to remove specified training data to satisfy privacy regulations such as GDPR. However, existing evaluations assume identical precision at unlearning and deployment, overlooking that production LLMs are deployed at low-bit precision. We show that INT4 quantization systematically restores forgotten content even when models pass compliance audits at bfloat16 (BF16); we term this the quantization recovery attack (QRA). We conduct the first systematic study of unlearning robustness under adapter-space INT4 quantization in the NF4+LoRA regime, evaluating seven methods on LLaMA-3-8B-Instruct across TOFU, MUSE-News, and WikiBio-WPU. INT8 is benign; INT4 induces recovery of up to 22x, worsening with dataset difficulty. We identify the FA-RA-Q-INT4 trilemma: no method simultaneously achieves strong forgetting, high utility, and quantization robustness. A dense Pareto…
