LUNE: Efficient LLM Unlearning via LoRA Fine-Tuning with Negative Examples
Yezi Liu, Hanning Chen, Wenjun Huang, Yang Ni, Mohsen Imani

TL;DR
LUNE is a lightweight method for unlearning specific knowledge in large language models: it fine-tunes only low-rank adapters on negative examples while the backbone stays frozen, cutting compute and memory by roughly an order of magnitude relative to full fine-tuning.
Contribution
The paper proposes a LoRA-based unlearning framework that confines knowledge removal to low-rank adapters, making unlearning more practical and resource-efficient.
Findings
Achieves unlearning effectiveness comparable to full fine-tuning.
Reduces compute and memory cost roughly tenfold relative to full fine-tuning.
Applies successfully to multiple factual unlearning tasks.
Abstract
Large language models (LLMs) possess vast knowledge acquired from extensive training corpora, but they often cannot remove specific pieces of information when needed, which makes it hard to handle privacy, bias mitigation, and knowledge correction. Traditional model unlearning approaches require computationally expensive fine-tuning or direct weight editing, making them impractical for real-world deployment. In this work, we introduce LoRA-based Unlearning with Negative Examples (LUNE), a lightweight framework that performs negative-only unlearning by updating only low-rank adapters while freezing the backbone, thereby localizing edits and avoiding disruptive global changes. Leveraging Low-Rank Adaptation (LoRA), LUNE targets intermediate representations to suppress (or replace) requested knowledge with an order-of-magnitude lower compute and memory than full fine-tuning or direct…
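The abstract's core mechanism (a frozen backbone, trainable low-rank adapters, and updates driven only by negative examples) can be illustrated with a toy sketch. This is not the authors' implementation: the summary above does not state LUNE's loss, so the objective here (a gradient step that lowers the log-probability of the token to forget) is an assumption, as are the single linear layer, the pure-Python setup, and every name (`lora_logits`, `negative_step`, `trainable_params`, `y_forget`).

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    m = max(z)
    e = [math.exp(t - m) for t in z]
    s = sum(e)
    return [t / s for t in e]

def lora_logits(W, A, B, x, scale=1.0):
    """Frozen path W x plus low-rank adapter path scale * B (A x)."""
    base = matvec(W, x)
    low = matvec(B, matvec(A, x))
    return [b + scale * l for b, l in zip(base, low)]

def negative_step(W, A, B, x, y_forget, scale=1.0, lr=0.1):
    """One descent step on log p(y_forget | x). Only A and B change.

    ASSUMPTION: this objective is a stand-in; the source does not give
    the loss LUNE actually uses.
    """
    p = softmax(lora_logits(W, A, B, x, scale))
    # Gradient of log p[y] with respect to the logits: onehot(y) - p
    g = [(1.0 if k == y_forget else 0.0) - pk for k, pk in enumerate(p)]
    r = len(A)
    Ax = matvec(A, x)
    Btg = [sum(B[k][j] * g[k] for k in range(len(B))) for j in range(r)]
    # grad_B = scale * g (A x)^T, grad_A = scale * (B^T g) x^T
    new_B = [[B[k][j] - lr * scale * g[k] * Ax[j] for j in range(r)]
             for k in range(len(B))]
    new_A = [[A[j][i] - lr * scale * Btg[j] * x[i] for i in range(len(x))]
             for j in range(r)]
    return new_A, new_B

def trainable_params(d_out, d_in, r):
    """Adapter parameter count: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)

# Toy setup: 5 output tokens, hidden size 8, adapter rank 2.
rng = random.Random(0)
V, d, r = 5, 8, 2
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(V)]   # frozen backbone
W_before = [row[:] for row in W]
A = [[rng.gauss(0, 0.5) for _ in range(d)] for _ in range(r)]  # random init
B = [[0.0] * r for _ in range(V)]                              # zero init: adapter starts as a no-op
x = [rng.gauss(0, 1) for _ in range(d)]                        # one negative example's features

base = matvec(W, x)
y_forget = max(range(V), key=base.__getitem__)  # token the backbone currently prefers

history = [softmax(lora_logits(W, A, B, x))[y_forget]]
for _ in range(5):
    A, B = negative_step(W, A, B, x, y_forget)
    history.append(softmax(lora_logits(W, A, B, x))[y_forget])
```

In this toy, `history` records the probability assigned to the forget token before and after each update, and `W` is never written to, which is the sense in which the edit stays local to the adapter. Here the adapter holds `trainable_params(5, 8, 2)` values against the 40 entries of `W`; the gap widens as the layer grows, since the adapter scales with `r * (d_in + d_out)` rather than `d_in * d_out`.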
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education
