RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models

Bichen Wang; Yuzhe Zi; Yixin Sun; Yanyan Zhao; Bing Qin

arXiv:2406.01983 · cs.CL · June 5, 2024

Open Access

TL;DR

This paper introduces RKLD, a novel knowledge distillation method based on reverse KL-divergence, to improve unlearning of personal data in large language models while maintaining their utility.

Contribution

The paper proposes RKLD, a new unlearning algorithm specifically designed for large language models to better forget personal information without sacrificing model performance.

Findings

1. RKLD achieves superior forget quality compared to existing methods.

2. RKLD maintains model utility effectively after unlearning.

3. Experimental results validate RKLD's effectiveness in unlearning personal data.

Abstract

With the passage of the Right to Be Forgotten (RTBF) regulations and the scaling up of language model training datasets, research on model unlearning in large language models (LLMs) has become more crucial. Before the era of LLMs, machine unlearning research focused mainly on classification tasks in models with small parameters. In these tasks, the content to be forgotten or retained is clear and straightforward. However, as parameter sizes have grown and tasks have become more complex, balancing forget quality and model utility has become more challenging, especially in scenarios involving personal data instead of classification results. Existing methods based on gradient ascent and its variants often struggle with this balance, leading to unintended information loss or partial forgetting. To address this challenge, we propose RKLD, a novel Reverse KL-Divergence-based…
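The abstract is cut off before it describes the RKLD objective, so the following is only a generic illustration of what "reverse" KL divergence means in a distillation setting, not the paper's loss. It assumes the common convention that forward KL is KL(teacher || student) and reverse KL is KL(student || teacher); the logits, the helper names `softmax` and `kl_divergence`, and the four-token vocabulary are all hypothetical.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical next-token distributions over a 4-token vocabulary.
teacher = softmax([2.0, 1.0, 0.1, -1.0])
student = softmax([0.5, 2.5, 0.0, -0.5])

# Forward KL weights the log-ratio by the teacher's probabilities.
forward_kl = kl_divergence(teacher, student)

# Reverse KL (assumed convention) weights it by the student's probabilities,
# so it penalizes the student for placing mass where the teacher places little.
reverse_kl = kl_divergence(student, teacher)

print(f"forward KL(teacher || student) = {forward_kl:.4f}")
print(f"reverse KL(student || teacher) = {reverse_kl:.4f}")
```

The two values differ because KL divergence is asymmetric, which is why the choice of direction matters when one distribution is used to steer another.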

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Topic Modeling · Recommender Systems and Techniques