Surgical Knowledge Rewrite in Compact LLMs: An 'Unlearn-then-Learn' Strategy with ($IA^3$) for Localized Factual Modulation and Catastrophic Forgetting Mitigation

Stanley Ngugi

arXiv:2508.07075·cs.LG·August 12, 2025



TL;DR

This paper presents a novel 'unlearn-then-learn' strategy using $IA^3$ for precise knowledge editing in compact LLMs, significantly reducing catastrophic forgetting and improving localization of factual updates.

Contribution

It introduces a two-stage, mechanistically informed approach combining circuit localization with PEFT to achieve accurate fact updates and mitigate forgetting in LLMs.

Findings

01

Achieves 98.50% accuracy in new fact modulation.

02

Suppresses original conflicting fact with 96.00% forget rate.

03

Dramatically improves localization accuracy to 72.00%.

Abstract

Large Language Models (LLMs) struggle with dynamic knowledge updates, especially when new information conflicts with deeply embedded facts. Such conflicting factual edits often lead to two critical issues: resistance to adopting the new fact and severe catastrophic forgetting of unrelated knowledge. This paper introduces and evaluates a novel "unlearn-then-learn" strategy for precise knowledge editing in LLMs, leveraging the parameter-efficient fine-tuning (PEFT) technique Infused Adapter by Inhibiting and Amplifying Inner Activations ($IA^3$). Crucially, this two-stage approach is powered by an initial circuit localization phase that identifies and targets the specific internal components responsible for encoding the conflicting fact. Through a rigorous experimental methodology on microsoft/Phi-3-mini-4k-instruct, we demonstrate that this mechanistically informed two-stage approach…
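
For readers unfamiliar with $IA^3$, the core operation it relies on is an element-wise rescaling of inner activations by a learned vector that starts as all ones. The sketch below illustrates only that rescaling on a toy activation. The function names, the toy values, and the "trained" vector are hypothetical and are not taken from the paper, and the sketch does not reproduce the paper's circuit localization or its two training stages.

```python
from typing import List


def init_scale(dim: int) -> List[float]:
    """Return an all-ones scaling vector, so the adapter starts as the identity."""
    return [1.0] * dim


def ia3_rescale(activation: List[float], scale: List[float]) -> List[float]:
    """Element-wise product of an activation with a learned scaling vector."""
    if len(activation) != len(scale):
        raise ValueError("activation and scale must have the same length")
    return [h * l for h, l in zip(activation, scale)]


# Toy inner activation (hypothetical values, for illustration only).
hidden = [0.5, -2.0, 3.0, 1.5]

# Before any training the scaling vector leaves the activation untouched.
unchanged = ia3_rescale(hidden, init_scale(len(hidden)))

# Hypothetical trained vector: inhibit unit 1, amplify unit 2, leave the rest alone.
trained = [1.0, 0.5, 2.0, 1.0]
edited = ia3_rescale(hidden, trained)

print(unchanged)
print(edited)
```

Because entries that stay at 1.0 leave their units unchanged, an edit expressed this way touches only the components whose scale was moved, which is the intuition behind pairing the technique with localization.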


Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Multimodal Machine Learning Applications · Topic Modeling