Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Abhimanyu Hans; Yuxin Wen; Neel Jain; John Kirchenbauer; Hamid Kazemi; Prajwal Singhania; Siddharth Singh; Gowthami Somepalli; Jonas Geiping; Abhinav Bhatele; Tom Goldstein

arXiv:2406.10209·cs.CL·November 5, 2024·3 cites

TL;DR

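This paper introduces the goldfish loss, a modification to the next-token training objective for large language models. Randomly sampled subsets of tokens are excluded from the loss computation during training, which reduces verbatim memorization of training data, and the associated privacy and copyright risks, with little to no impact on downstream benchmarks.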

Contribution

The paper proposes the goldfish loss, a training objective that mitigates memorization in large language models while maintaining their utility.

Findings

01 Significant reduction in extractable memorization of training data.

02 Little to no impact on downstream benchmark performance.

03 Effective for billion-scale Llama-2 models, both pre-trained and trained from scratch.

Abstract

Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, randomly sampled subsets of tokens are excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verbatim reproduction of a complete chain of tokens from the training set. We run extensive experiments training billion-scale Llama-2 models, both pre-trained and trained from scratch, and demonstrate significant reductions in extractable memorization with little to no impact on downstream benchmarks.
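
The masking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the official repository for that); it assumes each token is dropped independently with a hypothetical probability `drop_prob`, and that the per-token log-probabilities `log P(x_i | x_<i)` have already been computed by a model. The function names `sample_keep_mask` and `masked_next_token_loss` are illustrative only.

```python
import math
import random


def sample_keep_mask(num_tokens, drop_prob, rng):
    """Return one boolean per token; False marks a token excluded from the loss.

    Assumption: tokens are dropped independently with probability drop_prob.
    """
    return [rng.random() >= drop_prob for _ in range(num_tokens)]


def masked_next_token_loss(token_log_probs, keep_mask):
    """Average negative log-likelihood over the kept tokens only.

    token_log_probs[i] is log P(x_i | x_<i) under the model.
    Dropped tokens contribute nothing, so the model receives no
    training signal to reproduce them.
    """
    if len(token_log_probs) != len(keep_mask):
        raise ValueError("token_log_probs and keep_mask must have equal length")
    kept = [lp for lp, keep in zip(token_log_probs, keep_mask) if keep]
    if not kept:
        # Every token was dropped; define the loss as zero for this sequence.
        return 0.0
    return -sum(kept) / len(kept)


# Toy usage with made-up per-token probabilities (illustrative values only).
rng = random.Random(0)
log_probs = [math.log(p) for p in (0.9, 0.5, 0.25, 0.8, 0.1, 0.6)]
mask = sample_keep_mask(len(log_probs), drop_prob=0.25, rng=rng)
loss = masked_next_token_loss(log_probs, mask)
```

With a mask that keeps every token, this reduces to the standard next-token loss, so the only change is which positions contribute to the average.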

Code & Models

Repositories

ahans30/goldfish-loss
PyTorch · Official

Datasets

ahans1/wikipedia-en-2k-samples
dataset · 26 downloads

Taxonomy

Topics: Natural Language Processing Techniques · Wikis in Education and Collaboration · Topic Modeling