Quantization-Robust LLM Unlearning via Low-Rank Adaptation
João Vitor Boer Abitante, Joana Meneguzzo Pasquali, Luan Fonseca Garcia, Ewerton de Oliveira, Thomas da Silva Paula, Rodrigo C. Barros, Lucas S. Kupssinskü

TL;DR
This paper introduces a quantization-robust unlearning method for LLMs using low-rank adaptation, enabling effective knowledge removal while maintaining utility after aggressive 4-bit quantization.
Contribution
The authors propose a novel LoRA-based approach that preserves unlearning updates post-quantization, improving utility and reducing privacy leakage in quantized models.
Findings
LoRA improves 4-bit utility by up to 7.93 points on Llama-2-7B (NPO+GDR on MUSE BOOKS: 50.17 to 58.10).
LoRA reduces privacy leakage significantly under 4-bit PTQ.
LoRA maintains strong forgetting capabilities after quantization.
Abstract
Large Language Model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. However, aggressive low-bit PTQ can mask unlearning updates, causing quantized models to revert to pre-unlearning behavior. We show that standard full-parameter fine-tuning often induces parameter changes that are too small to survive 4-bit quantization. We propose quantization-robust unlearning via low-rank adaptation (LoRA): we freeze the base model and concentrate unlearning into trainable adapters so that the effective update is preserved after quantization. On Llama-2-7B evaluated on the MUSE dataset (BOOKS and NEWS), LoRA improves 4-bit utility by up to 7.93 points (NPO+GDR on BOOKS: 50.17 to 58.10) and yields higher 4-bit utility on NEWS for GA+GDR (40.06 to 44.82, an increase of 4.76). LoRA also…
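The masking effect described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method or code: it assumes a simple symmetric round-to-nearest 4-bit quantizer, made-up weight values, and a shared quantization scale (all hypothetical, chosen only for illustration). It shows that a very small weight update leaves every 4-bit code unchanged, so the quantized model behaves as if no update happened, while a larger update moves weights onto different grid points and survives.

```python
def quantize_rtn(weights, bits=4, scale=None):
    """Toy symmetric round-to-nearest quantizer (illustrative assumption,
    not the PTQ scheme used in the paper). Returns integer codes and scale."""
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit signed codes
    if scale is None:
        scale = max(abs(w) for w in weights) / qmax
    # Round to the nearest grid point and clip to the range [-8, 7].
    codes = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


# Hypothetical weights of a tiny "layer" before unlearning.
base = [0.70, -0.33, 0.12, -0.58, 0.24]

# A small update (stand-in for a tiny fine-tuning change) and a larger one.
small_update = [w + 0.001 for w in base]
large_update = [w + 0.09 for w in base]

# Reuse the base scale so all three versions are compared on the same grid.
codes_base, scale = quantize_rtn(base)
codes_small, _ = quantize_rtn(small_update, scale=scale)
codes_large, _ = quantize_rtn(large_update, scale=scale)

print("base codes :", codes_base)
print("small codes:", codes_small, "| masked:", codes_small == codes_base)
print("large codes:", codes_large, "| masked:", codes_large == codes_base)
```

With a grid step of roughly 0.1, any change much smaller than half a step is rounded away. This is only the intuition behind the claim that the effective update must be large enough to be preserved after quantization; real PTQ schemes typically use per-group scales and more elaborate rounding.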
