Quantization-Robust LLM Unlearning via Low-Rank Adaptation

João Vitor Boer Abitante; Joana Meneguzzo Pasquali; Luan Fonseca Garcia; Ewerton de Oliveira; Thomas da Silva Paula; Rodrigo C. Barros; Lucas S. Kupssinskü

arXiv:2602.13151·cs.LG·April 8, 2026

TL;DR

This paper introduces a quantization-robust unlearning method for LLMs using low-rank adaptation, enabling effective knowledge removal while maintaining utility after aggressive 4-bit quantization.

Contribution

The authors propose a novel LoRA-based approach that preserves unlearning updates post-quantization, improving utility and reducing privacy leakage in quantized models.

Findings

01. LoRA improves 4-bit utility on Llama-2-7B by up to 7.93 points (NPO+GDR on MUSE BOOKS: 50.17 to 58.10).

02. LoRA reduces privacy leakage significantly under 4-bit PTQ.

03. LoRA maintains strong forgetting capabilities after quantization.

Abstract

Large Language Model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. However, aggressive low-bit PTQ can mask unlearning updates, causing quantized models to revert to pre-unlearning behavior. We show that standard full-parameter fine-tuning often induces parameter changes that are too small to survive 4-bit quantization. We propose quantization-robust unlearning via low-rank adaptation (LoRA): we freeze the base model and concentrate unlearning into trainable adapters so that the effective update is preserved after quantization. On Llama-2-7B evaluated with the MUSE dataset (BOOKS and NEWS), LoRA improves 4-bit utility by up to 7.93 points (NPO+GDR on BOOKS: 50.17 to 58.10) and yields higher 4-bit utility on NEWS for GA+GDR (40.06 to 44.82, an increase of 4.76). LoRA also…
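The masking effect described in the abstract can be illustrated with a toy experiment. The sketch below is not the paper's quantizer or evaluation setup: it assumes a simple symmetric round-to-nearest 4-bit scheme with one absmax scale per tensor, and the weight distribution and update magnitudes are hypothetical values chosen only to contrast a very small parameter change with a larger one.

```python
import random


def quantize_4bit(weights):
    """Symmetric round-to-nearest quantization to integer codes in [-7, 7].

    Uses one absmax scale for the whole tensor (a simplifying assumption).
    """
    scale = max(abs(w) for w in weights) / 7.0
    return [max(-7, min(7, round(w / scale))) for w in weights]


def fraction_of_codes_changed(base, updated):
    """Share of weights whose 4-bit code differs between the two models."""
    base_codes = quantize_4bit(base)
    new_codes = quantize_4bit(updated)
    return sum(b != n for b, n in zip(base_codes, new_codes)) / len(base)


rng = random.Random(0)
base = [rng.gauss(0.0, 0.02) for _ in range(10_000)]  # toy "pretrained" weights

# Hypothetical update magnitudes, chosen only to contrast the two regimes.
small_update = [w + rng.uniform(-1e-5, 1e-5) for w in base]
large_update = [w + rng.uniform(-1e-2, 1e-2) for w in base]

small_frac = fraction_of_codes_changed(base, small_update)
large_frac = fraction_of_codes_changed(base, large_update)
print(f"codes changed, small update: {small_frac:.2%}")
print(f"codes changed, large update: {large_frac:.2%}")
```

In this toy setting, the small update sits far below the quantization step, so almost every weight rounds to the same 4-bit code as before and the quantized model is nearly indistinguishable from the original. The larger update is comparable to the step size and changes a substantial share of the codes, which is the regime an update needs to reach for unlearning to survive PTQ.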
