LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models

Akshaj Kumar Veldanda; Shi-Xiong Zhang; Anirban Das; Supriyo Chakraborty; Stephen Rawls; Sambit Sahu; Milind Naphade

arXiv:2409.13054 · cs.CL · September 23, 2024

TL;DR

This paper introduces LLM Surgery, a framework for efficiently unlearning outdated or problematic knowledge in large language models while updating them with new information, without full retraining.

Contribution

It proposes a novel optimization framework combining unlearning, updating, and retention objectives, and provides a new dataset and benchmark for evaluation.

Findings

01 Achieves significant forgetting of problematic knowledge
02 Increases accuracy on updated information by 20%
03 Maintains performance on retained knowledge

Abstract

Large language models (LLMs) have revolutionized various domains, yet their utility comes with significant challenges related to outdated or problematic knowledge embedded during pretraining. This paper addresses the challenge of modifying LLMs to unlearn problematic and outdated information while efficiently integrating new knowledge without retraining from scratch. Here, we propose LLM Surgery, a framework to efficiently modify LLM behaviour by optimizing a three component objective function that: (1) Performs reverse gradient on unlearning dataset (problematic and outdated information), (2) Performs gradient descent on the update dataset (new and updated information), and (3) Minimizes the KL divergence on the retain dataset (small subset of unchanged text), ensuring alignment between pretrained and modified model outputs. Due to the lack of publicly available datasets specifically…
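The three-component objective described in the abstract can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not the authors' implementation: the function names, the per-term weights (`w_unlearn`, `w_update`, `w_retain`), the use of token-level negative log-likelihood, and the KL direction (pretrained relative to modified) are all hypothetical choices that the abstract does not specify. "Reverse gradient" is modeled here as a sign flip on the unlearning term, so that minimizing the total pushes probability away from the unlearn targets.

```python
import math


def nll(probs, target):
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def surgery_loss(unlearn, update, retain,
                 w_unlearn=1.0, w_update=1.0, w_retain=1.0):
    """Toy combined objective (weights are assumptions, not from the paper).

    unlearn, update: lists of (model_probs, target_index) pairs.
    retain: list of (pretrained_probs, model_probs) pairs.
    """
    # (1) Unlearning: the sign flip turns descent on the total into
    #     ascent on the unlearn-set likelihood loss.
    l_unlearn = mean(nll(p, t) for p, t in unlearn)
    # (2) Updating: ordinary likelihood loss on the new information.
    l_update = mean(nll(p, t) for p, t in update)
    # (3) Retention: keep modified outputs close to the pretrained ones.
    l_retain = mean(kl_divergence(p_ref, p_new) for p_ref, p_new in retain)
    return -w_unlearn * l_unlearn + w_update * l_update + w_retain * l_retain


# Example with a two-token vocabulary.
uniform = [0.5, 0.5]
loss = surgery_loss(
    unlearn=[([0.9, 0.1], 0)],   # model still confident in the outdated target
    update=[(uniform, 1)],       # model unsure about the new target
    retain=[(uniform, uniform)]  # modified model matches pretrained model
)
```

In this toy, the loss falls as the model assigns less probability to unlearn targets and more to update targets, while the retention term stays at zero only when the modified distribution matches the pretrained one on retained text. In practice an unbounded ascent term usually needs to be balanced by the weights or by the retention term, which is why the weights are exposed as parameters here.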

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Radiomics and Machine Learning in Medical Imaging