Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning

Shota Takashiro; Takeshi Kojima; Andrew Gambardella; Qi Cao; Yusuke Iwasawa; Yutaka Matsuo

arXiv:2410.00382·cs.CL·June 4, 2025

TL;DR

This paper introduces in-context knowledge unlearning, a fine-tuning method that lets a large language model selectively withhold specific knowledge at test time depending on the query context, for example answering authorized internal users while withholding the same information from external ones.

Contribution

The paper proposes a fine-tuning approach that enables LLMs to unlearn targeted knowledge dynamically, and it analyzes internal model behavior and layer-wise decision-making in detail.

Findings

01 Achieves up to 95% forget accuracy

02 Retains 80% of unrelated knowledge

03 Outperforms existing baselines in various scenarios
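
The page does not define how the forget and retain numbers above are measured, so the following is only a minimal sketch of one plausible reading: a forget query counts as correct when the model declines to answer, and an unrelated query counts as correct when the model still gives the gold answer. The refusal string `FORGET_RESPONSE`, the `Example` type, and the exact-match scoring are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical refusal string: an assumption for illustration, not the paper's format.
FORGET_RESPONSE = "forgot"


@dataclass
class Example:
    question: str
    gold_answer: str
    should_forget: bool  # True when the query context marks this knowledge as a forget target


def normalize(text: str) -> str:
    """Lowercase and trim so that exact-match comparison ignores trivial differences."""
    return text.strip().lower()


def score(examples: list[Example], predictions: list[str]) -> dict[str, float]:
    """Compute forget accuracy and retain accuracy under the assumptions stated above."""
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")

    forget_hits = forget_total = 0
    retain_hits = retain_total = 0
    for example, prediction in zip(examples, predictions):
        predicted = normalize(prediction)
        if example.should_forget:
            # Forget target: correct only if the model declines to answer.
            forget_total += 1
            forget_hits += int(predicted == FORGET_RESPONSE)
        else:
            # Unrelated knowledge: correct only if the gold answer is still produced.
            retain_total += 1
            retain_hits += int(predicted == normalize(example.gold_answer))

    return {
        "forget_accuracy": forget_hits / forget_total if forget_total else 0.0,
        "retain_accuracy": retain_hits / retain_total if retain_total else 0.0,
    }
```

Exact match is the simplest scoring choice and is used here only to keep the sketch short; a real evaluation would follow whatever matching rule each benchmark specifies.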

Abstract

As large language models (LLMs) are applied across diverse domains, the ability to selectively unlearn specific information is becoming increasingly essential. For instance, LLMs are expected to selectively provide confidential information to authorized internal users, such as employees or trusted partners, while withholding it from external users, including the general public and unauthorized entities. Therefore, we propose a novel method termed "in-context knowledge unlearning", which enables the model to selectively forget information at test time based on the query context. Our method fine-tunes pre-trained LLMs to enable prompt unlearning of target knowledge within the context, while preserving unrelated information. Experiments on the TOFU, AGE, and RWKU datasets using Llama2-7B/13B and Mistral-7B models demonstrate that our method achieves up to 95% forget accuracy while retaining…

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: TOFU