Demystifying Verbatim Memorization in Large Language Models

Jing Huang; Diyi Yang; Christopher Potts

arXiv:2407.17817 · cs.CL · July 26, 2024 · 1 citation

TL;DR

This paper investigates how large language models memorize sequences verbatim. It finds that memorization depends on how often a sequence is repeated, on which checkpoint is being trained, and on high-level features encoded in the model's states, and that current unlearning methods struggle to remove memorized sequences without harming model performance.

Contribution

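The study introduces a controlled framework for analyzing verbatim memorization in LLMs, based on continued pre-training from Pythia checkpoints with injected sequences, and uses it to examine the mechanisms behind memorization and the challenges these pose for unlearning.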

Findings

01 Verbatim memorization requires non-trivial repetition.

02 Later checkpoints are more prone to memorization.

03 Unlearning methods often degrade model performance.

Abstract

Large Language Models (LLMs) frequently memorize long sequences verbatim, often with serious legal and privacy implications. Much prior work has studied such verbatim memorization using observational data. To complement such work, we develop a framework to study verbatim memorization in a controlled setting by continuing pre-training from Pythia checkpoints with injected sequences. We find that (1) non-trivial amounts of repetition are necessary for verbatim memorization to happen; (2) later (and presumably better) checkpoints are more likely to verbatim memorize sequences, even for out-of-distribution sequences; (3) the generation of memorized sequences is triggered by distributed model states that encode high-level features and makes important use of general language modeling capabilities. Guided by these insights, we develop stress tests to evaluate unlearning methods and find they…
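The controlled setup described in the abstract (injecting a sequence into the continued pre-training data at a chosen repetition rate, then checking whether the model reproduces it) can be illustrated with a small sketch. This is not the authors' implementation: the helper names `inject_sequence` and `verbatim_match_length`, the token-ID list representation, and the use of exact prefix matching as the memorization measure are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Sequence


def inject_sequence(
    stream: Iterable[List[int]],
    injected: List[int],
    every: int,
) -> Iterator[List[int]]:
    """Yield examples from `stream`, adding a copy of `injected` after
    every `every` original examples (hypothetical helper).

    A smaller `every` means the injected sequence is repeated more often.
    """
    if every <= 0:
        raise ValueError("`every` must be a positive integer")
    for count, example in enumerate(stream, start=1):
        yield example
        if count % every == 0:
            # Copy so later mutation of one occurrence cannot affect others.
            yield list(injected)


def verbatim_match_length(generated: Sequence[int], target: Sequence[int]) -> int:
    """Count how many leading tokens of `generated` equal `target` exactly.

    Assumption: exact prefix match is used here as a simple stand-in for a
    verbatim memorization measure; the paper may define it differently.
    """
    matched = 0
    for gen_tok, tgt_tok in zip(generated, target):
        if gen_tok != tgt_tok:
            break
        matched += 1
    return matched


# Toy data: six ordinary "training examples" and one injected sequence.
ordinary = [[i, i + 1] for i in range(6)]
secret = [99, 98, 97]
mixed = list(inject_sequence(ordinary, secret, every=2))
```

In a real experiment the stream would come from a pre-training corpus, the model would be a Pythia checkpoint trained on the mixed stream, and `generated` would be the model's continuation of a prompt taken from the injected sequence. Varying `every` is one way to control how much repetition the model sees.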

Code & Models

Repositories

explanare/verbatim-memorization (PyTorch, official)

Videos

Demystifying Verbatim Memorization in Large Language Models (Underline)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems

Methods: Pythia