CURaTE: Continual Unlearning in Real Time with Ensured Preservation of LLM Knowledge
Seyun Bae, Seokhan Lee, Eunho Yang

TL;DR
CURaTE is a real-time continual unlearning method for large language models that forgets specific data while preserving overall knowledge, without modifying model parameters.
Contribution
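The paper proposes training a sentence embedding model to compare each input prompt against stored forget requests, which allows unlearning to take effect immediately. The authors report that it outperforms existing methods in both forgetting effectiveness and knowledge preservation.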
Findings
CURaTE achieves more effective forgetting than existing methods.
It maintains near-perfect knowledge preservation over multiple updates.
It is the only method capable of real-time continual unlearning without parameter modification.
Abstract
The inability to filter out in advance all potentially problematic data from the pre-training of large language models has given rise to the need for methods for unlearning specific pieces of knowledge after training. Existing techniques overlook the need for continuous and immediate action, causing them to suffer from degraded utility as updates accumulate and protracted exposure of sensitive information. To address these issues, we propose Continual Unlearning in Real Time with Ensured Preservation of LLM Knowledge (CURaTE). Our method begins by training a sentence embedding model on a dataset designed to enable the formation of sharp decision boundaries for determining whether a given input prompt corresponds to any stored forget requests. The similarity of a given input to the forget requests is then used to determine whether to answer or return a refusal response. We show that even…
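The gating step described in the abstract (compare an input prompt to stored forget requests, then answer or refuse) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_embed` is a bag-of-words stand-in for the trained sentence embedding model, and the names `ForgetGate` and `REFUSAL`, the 0.8 threshold, and the max-over-requests decision rule are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Hypothetical refusal text; the source does not specify the wording.
REFUSAL = "I'm sorry, but I can't help with that."


def toy_embed(text):
    """Stand-in for a trained sentence embedding model (assumption):
    a sparse bag-of-words count vector over lowercase tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ForgetGate:
    """Wraps an unmodified LLM callable and refuses prompts that are
    close to any stored forget request."""

    def __init__(self, llm, threshold=0.8):
        self.llm = llm                # any callable: prompt -> response
        self.threshold = threshold    # assumed value, chosen for illustration
        self.forget_embeddings = []   # grows as forget requests arrive

    def add_forget_request(self, text):
        # Takes effect on the next call; no model parameters are touched.
        self.forget_embeddings.append(toy_embed(text))

    def max_similarity(self, prompt):
        query = toy_embed(prompt)
        return max(
            (cosine(query, stored) for stored in self.forget_embeddings),
            default=0.0,
        )

    def respond(self, prompt):
        if self.max_similarity(prompt) >= self.threshold:
            return REFUSAL
        return self.llm(prompt)


def echo_llm(prompt):
    """Placeholder for the underlying language model (assumption)."""
    return "ANSWER: " + prompt
```

Because the forget requests live in a list outside the model, adding one changes behavior on the very next prompt, and prompts that are not similar to any stored request still reach the unchanged model. A real system would replace `toy_embed` with the trained embedding model so that paraphrases of a forget request are also caught.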
