Avoiding Copyright Infringement via Large Language Model Unlearning

Guangyao Dou; Zheyuan Liu; Qing Lyu; Kaize Ding; Eric Wong

arXiv:2406.10952 · cs.CL · February 12, 2025 · 1 citation

TL;DR

This paper introduces Stable Sequential Unlearning (SSU), a new framework for effectively removing copyrighted content from large language models over multiple time steps while preserving their general language capabilities.

Contribution

The paper proposes a novel SSU framework that enables sequential unlearning of copyrighted content in LLMs, addressing a practical and previously underexplored problem.

Findings

01. SSU outperforms existing baselines in unlearning efficacy.

02. SSU maintains the model's general-purpose knowledge.

03. SSU achieves an effective trade-off between unlearning efficacy and general language abilities.

Abstract

Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material, leading to significant legal and ethical concerns. In real-world scenarios, model owners need to continuously address copyright infringement as new requests for content removal emerge at different time points. This leads to the need for sequential unlearning, where copyrighted content is removed sequentially as new requests arise. Despite its practical relevance, sequential unlearning in the context of copyright infringement has not been rigorously explored in existing literature. To address this gap, we propose Stable Sequential Unlearning (SSU), a novel framework designed to unlearn copyrighted content from LLMs over multiple time steps. Our approach works by identifying and removing specific weight updates in the model's parameters…
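
The abstract is truncated before it details how the weight updates are identified, so the following is only a minimal sketch of one common way to "remove a weight update": compute the difference between a model fine-tuned on the content to forget and the current model, then subtract that difference. The names `task_vector`, `subtract_update`, `sequential_unlearn`, and the `fine_tune` callback are hypothetical and chosen for illustration; weights are plain lists of floats rather than real model tensors, and nothing here is taken from the SSU repository.

```python
from typing import Callable, Dict, List, Sequence

# Toy stand-in for a model's parameters: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def task_vector(current: Weights, finetuned: Weights) -> Weights:
    """Weight update introduced by fine-tuning: finetuned - current."""
    return {
        name: [f - c for f, c in zip(finetuned[name], current[name])]
        for name in current
    }


def subtract_update(current: Weights, update: Weights, scale: float = 1.0) -> Weights:
    """Remove an update from the current weights: current - scale * update."""
    return {
        name: [w - scale * u for w, u in zip(current[name], update[name])]
        for name in current
    }


def sequential_unlearn(
    base: Weights,
    requests: Sequence[object],
    fine_tune: Callable[[Weights, object], Weights],
    scale: float = 1.0,
) -> Weights:
    """Handle removal requests one time step at a time.

    `fine_tune` is an assumed callback that returns weights fine-tuned on the
    content named by a request; a real system would run gradient descent here.
    """
    current = base
    for request in requests:
        finetuned = fine_tune(current, request)
        update = task_vector(current, finetuned)
        current = subtract_update(current, update, scale)
    return current
```

In this sketch each request is processed against the weights left by the previous step, which mirrors the sequential setting the abstract describes; how SSU keeps this process stable and preserves general capabilities is not recoverable from the truncated text.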

Code & Models

Repositories

guangyaodou/SSU_Unlearn (PyTorch, official)

Videos

Avoiding Copyright Infringement via Large Language Model Unlearning · underline

Taxonomy

Topics: Digital Rights Management and Security