QUAIL: Quantization Aware Unlearning for Mitigating Misinformation in LLMs
Himanshu Mishra, Kanwal Mehreen

TL;DR
This paper introduces a quantization-aware unlearning method that keeps targeted knowledge removed from large language models even after 4-bit quantization, a critical obstacle when deploying unlearned models.
Contribution
The paper analyzes how low-bit quantization affects unlearning and proposes a new logits-space hinge loss that maintains forgetting after quantization.
Findings
The proposed method preserves forgetting under 4-bit quantization.
Models unlearned with existing methods recover nearly all of the forgotten knowledge after quantization.
The approach is effective on language and classification tasks, including misinformation detection.
Abstract
Machine unlearning aims to remove specific knowledge (e.g., copyrighted or private data) from a trained model without full retraining. In practice, models are often quantized (e.g., 4-bit) for deployment, but we find that quantization can catastrophically restore forgotten information [1]. In this paper, we (1) analyze why low-bit quantization undermines unlearning, and (2) propose a quantization-aware unlearning method to mitigate this. We first compute weight-change statistics and bucket overlaps in quantization to show that typical unlearning updates are too small to cross quantization thresholds. Building on this insight, we introduce a logits space hinge loss: for each forget example, we force the output logits of the unlearned model to differ from the original model by at least a margin (half the quantization step). This ensures forgotten examples remain distinguishable even after…
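The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The symmetric round-to-nearest quantizer, the function names (`quantize_codes`, `bucket_overlap`, `logit_hinge_loss`), the per-logit absolute-difference form of the hinge, and the toy numbers are all assumptions made for illustration. The abstract states only that the margin is half the quantization step, so the margin is left here as a plain parameter.

```python
import numpy as np


def quantize_codes(w, step, bits=4):
    """Map weights to integer bucket indices on a uniform grid (assumed quantizer)."""
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit
    return np.clip(np.round(w / step), -qmax - 1, qmax).astype(np.int64)


def bucket_overlap(w_orig, w_unlearned, bits=4):
    """Fraction of weights that fall in the same bucket before and after unlearning.

    Both weight vectors are quantized on the grid derived from the original weights.
    """
    qmax = 2 ** (bits - 1) - 1
    step = np.max(np.abs(w_orig)) / qmax
    same = quantize_codes(w_orig, step, bits) == quantize_codes(w_unlearned, step, bits)
    return float(np.mean(same))


def logit_hinge_loss(logits_unlearned, logits_orig, margin):
    """Hinge penalty that is zero once every logit differs from the original by >= margin.

    Hypothetical form: mean over logits of max(0, margin - |z_unlearned - z_orig|).
    """
    gap = np.abs(logits_unlearned - logits_orig)
    return float(np.mean(np.maximum(0.0, margin - gap)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.02, size=10_000)            # toy "original" weights
    w_small = w + rng.normal(0.0, 1e-5, size=w.shape)  # tiny unlearning update

    # Tiny updates rarely cross a bucket boundary, so the quantized models nearly coincide.
    print("bucket overlap:", bucket_overlap(w, w_small))

    z_orig = np.array([2.0, -1.0, 0.5])
    # The margin value here is a placeholder, not a value taken from the paper.
    print("loss, unchanged logits:", logit_hinge_loss(z_orig, z_orig, margin=0.5))
    print("loss, shifted logits:", logit_hinge_loss(z_orig + 1.0, z_orig, margin=0.5))
```

In this toy setting the overlap stays close to 1, which mirrors the abstract's observation that typical unlearning updates are too small to cross quantization thresholds. The hinge term is positive only while the unlearned logits remain within the margin of the original ones.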
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Misinformation and Its Impacts
