Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty, Suzan Verberne, Fatih Turkmen

TL;DR
This survey reviews the phenomenon of memorization in large language models, analyzing its causes, metrics, and mitigation strategies, and discusses future research directions to address privacy and security risks.
Contribution
It provides a comprehensive taxonomy of LLM memorization, evaluates current measurement methods, and explores mitigation strategies and future research topics.
Findings
Memorization poses significant privacy and security risks.
Various metrics and methods exist to quantify memorization.
Strategies to mitigate undesirable memorization are discussed.
Abstract
While recent research increasingly showcases the remarkable capabilities of Large Language Models (LLMs), it is equally crucial to examine their associated risks. Among these, privacy and security vulnerabilities are particularly concerning, posing significant ethical and legal challenges. At the heart of these vulnerabilities stands memorization, which refers to a model's tendency to store and reproduce phrases from its training data. This phenomenon has been shown to be a fundamental source of various privacy and security attacks against LLMs. In this paper, we provide a taxonomy of the literature on LLM memorization, exploring it across three dimensions: granularity, retrievability, and desirability. Next, we discuss the metrics and methods used to quantify memorization, followed by an analysis of the causes and factors that contribute to the memorization phenomenon. We then explore…
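As a rough illustration of the abstract's definition (a model that can "store and reproduce phrases from its training data"), the sketch below scores how much of a known training continuation a model's output reproduces verbatim. It is not taken from the survey: the function names, the token-list inputs, and the prefix-match scoring rule are all assumptions chosen for simplicity, and the metrics the survey actually reviews may differ.

```python
from typing import Sequence


def verbatim_overlap(reference_suffix: Sequence[str], generated: Sequence[str]) -> float:
    """Fraction of the reference suffix reproduced exactly, counted from the start.

    Hypothetical scoring rule: tokens are compared position by position and
    counting stops at the first mismatch.
    """
    if not reference_suffix:
        return 0.0
    matched = 0
    for ref_tok, gen_tok in zip(reference_suffix, generated):
        if ref_tok != gen_tok:
            break
        matched += 1
    return matched / len(reference_suffix)


def is_verbatim_memorized(
    reference_suffix: Sequence[str],
    generated: Sequence[str],
    threshold: float = 1.0,  # assumed default: require the full suffix to match
) -> bool:
    """Flag a sample as memorized when the overlap reaches the threshold."""
    return verbatim_overlap(reference_suffix, generated) >= threshold


if __name__ == "__main__":
    # Toy tokens standing in for a training continuation and a model output.
    reference = "the quick brown fox".split()
    output = "the quick red fox".split()
    print(verbatim_overlap(reference, output))
```

In practice the generated tokens would come from prompting a model with the prefix that precedes the reference suffix in the training data; that step is omitted here because it depends on the model and tokenizer in use.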
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Diffusion
