QUAIL: Quantization Aware Unlearning for Mitigating Misinformation in LLMs

Himanshu Mishra; Kanwal Mehreen

arXiv:2601.15538 · cs.LG · January 23, 2026

Open Access

TL;DR

This paper introduces a quantization-aware unlearning method that keeps targeted knowledge removed from a large language model even after 4-bit quantization, addressing a critical challenge in deploying unlearned models.

Contribution

The paper analyzes how low-bit quantization affects unlearning and proposes a new logits space hinge loss to maintain forgetting after quantization.

Findings

01

Our method preserves forgetting under 4-bit quantization.

02

With existing unlearning methods, quantization restores nearly all of the forgotten knowledge.

03

The approach is effective on language and classification tasks, including misinformation detection.
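
A toy illustration of the effect behind finding 02: when an unlearning update is smaller than half a quantization step, the original and unlearned weights tend to land in the same 4-bit bucket, so the quantized models coincide. This is only a sketch under stated assumptions, not the paper's analysis code. The symmetric per-tensor absmax quantizer, the function names, and the weight values are all hypothetical.

```python
def absmax_scale(weights, bits=4):
    """Step size of a symmetric uniform quantizer (assumed scheme, per tensor)."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    return max(abs(w) for w in weights) / qmax


def quantize_codes(weights, scale, bits=4):
    """Map each weight to its integer bucket index (round to nearest, then clamp)."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    return [min(qmax, max(qmin, round(w / scale))) for w in weights]


def bucket_overlap(w_orig, w_unlearned, bits=4):
    """Fraction of weights whose bucket is unchanged by the unlearning update.

    Both tensors are quantized with the original tensor's scale (an assumption
    made here to keep the comparison simple).
    """
    scale = absmax_scale(w_orig, bits)
    codes_orig = quantize_codes(w_orig, scale, bits)
    codes_unl = quantize_codes(w_unlearned, scale, bits)
    same = sum(a == b for a, b in zip(codes_orig, codes_unl))
    return same / len(codes_orig)


def mean_abs_change(w_orig, w_unlearned):
    """Average magnitude of the unlearning update."""
    return sum(abs(a - b) for a, b in zip(w_orig, w_unlearned)) / len(w_orig)


# Made-up weights: a small update (0.01 per weight) and a larger one (0.08 on two weights).
w_orig = [0.70, -0.32, 0.11, 0.42, -0.61]
w_small = [0.69, -0.33, 0.12, 0.43, -0.60]
w_large = [0.70, -0.24, 0.19, 0.42, -0.61]

half_step = 0.5 * absmax_scale(w_orig)
print(mean_abs_change(w_orig, w_small) < half_step)  # True: update is below half a step
print(bucket_overlap(w_orig, w_small))               # 1.0: every weight keeps its bucket
print(bucket_overlap(w_orig, w_large))               # 0.6: two of five weights change bucket
```

In this toy case the small update is erased entirely by quantization, while the larger one survives only for the weights that cross a bucket boundary.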

Abstract

Machine unlearning aims to remove specific knowledge (e.g., copyrighted or private data) from a trained model without full retraining. In practice, models are often quantized (e.g., 4-bit) for deployment, but we find that quantization can catastrophically restore forgotten information [1]. In this paper, we (1) analyze why low-bit quantization undermines unlearning, and (2) propose a quantization-aware unlearning method to mitigate this. We first compute weight-change statistics and bucket overlaps in quantization to show that typical unlearning updates are too small to cross quantization thresholds. Building on this insight, we introduce a logits space hinge loss: for each forget example, we force the output logits of the unlearned model to differ from the original model by at least a margin (half the quantization step). This ensures forgotten examples remain distinguishable even after…
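
The hinge loss described in the abstract can be sketched as follows. The abstract does not give the exact formula, so this is one plausible reading rather than the authors' implementation: the per-logit absolute difference, the averaging over logits, the function name, and the `quant_step` parameter are all assumptions made for illustration.

```python
def logit_hinge_loss(logits_unlearned, logits_original, quant_step):
    """Hinge penalty that is zero once each logit differs from the original
    model's logit by at least the margin (half the quantization step).

    Assumed form: mean over logits of max(0, margin - |z_unlearned - z_original|).
    """
    margin = 0.5 * quant_step
    penalties = [
        max(0.0, margin - abs(u - o))
        for u, o in zip(logits_unlearned, logits_original)
    ]
    return sum(penalties) / len(penalties)


# Made-up logits for a single forget example.
original = [2.0, -1.0, 0.5]
unchanged = [2.0, -1.0, 0.5]   # unlearned model still agrees with the original
separated = [2.5, -1.5, 1.0]   # every logit moved by 0.5, well past the margin

print(logit_hinge_loss(unchanged, original, quant_step=0.2))  # about 0.1, the full margin
print(logit_hinge_loss(separated, original, quant_step=0.2))  # 0.0
```

Under this reading, the loss pushes the unlearned model's outputs on forget examples away from the original model's outputs until the gap reaches the margin, and then stops contributing gradient.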

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Misinformation and Its Impacts