Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications
Till Speicher, Mohammad Aflah Khan, Qinyuan Wu, Vedant Nanda, Soumi Das, Bishwamittra Ghosh, Krishna P. Gummadi, Evimaria Terzi

TL;DR
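This paper introduces an experimental framework that repeatedly exposes large language models to random strings in order to study how they memorise data. It reports consistent memorisation dynamics across model families, factors that make some strings easier to memorise, and implications for model reliability and training-data privacy.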
Contribution
The paper presents a novel framework for disentangling memorisation from other phenomena in LLMs and uncovers key factors affecting memorisation behavior.
Findings
Memorisation dynamics pass through consistent phases across model families (Pythia, Phi and Llama2)
Some strings are easier to memorise than others, and the factors responsible are characterised
Sequential exposure impacts memorisation significantly
Abstract
Understanding whether and to what extent large language models (LLMs) have memorised training data has important implications for the reliability of their output and the privacy of their training data. In order to cleanly measure and disentangle memorisation from other phenomena (e.g. in-context learning), we create an experimental framework that is based on repeatedly exposing LLMs to random strings. Our framework allows us to better understand the dynamics, i.e., the behaviour of the model, when repeatedly exposing it to random strings. Using our framework, we make several striking observations: (a) we find consistent phases of the dynamics across families of models (Pythia, Phi and Llama2), (b) we identify factors that make some strings easier to memorise than others, and (c) we identify the role of local prefixes and global context in memorisation. We also show that sequential…
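The measurement idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the helper names (`make_random_string`, `recall_accuracy`), the lowercase alphabet, and the `PrefixLookupModel` class, which is a toy stand-in for an LLM that memorises perfectly after a single exposure. Because the string is random, any correct next-token prediction can only come from memorisation, which is what makes the measurement clean.

```python
import random
import string


def make_random_string(length, alphabet=string.ascii_lowercase, seed=0):
    """Sample a random string; the alphabet and seed are illustrative choices."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))


def recall_accuracy(predict_next, target):
    """Fraction of positions whose next token is predicted correctly from the true prefix."""
    positions = range(1, len(target))
    if not positions:
        return 0.0
    correct = sum(predict_next(target[:i]) == target[i] for i in positions)
    return correct / len(positions)


class PrefixLookupModel:
    """Hypothetical stand-in for an LLM: stores the next token for every prefix it has seen."""

    def __init__(self):
        self.table = {}

    def train_epoch(self, text):
        # One "exposure": record the token that follows each prefix.
        for i in range(1, len(text)):
            self.table[text[:i]] = text[i]

    def predict_next(self, prefix):
        # Returns None for prefixes that were never seen.
        return self.table.get(prefix)


target = make_random_string(64)
model = PrefixLookupModel()
before = recall_accuracy(model.predict_next, target)
model.train_epoch(target)
after = recall_accuracy(model.predict_next, target)
```

With a real model, `train_epoch` would be a fine-tuning pass over the string and `recall_accuracy` would be recorded after every pass, giving a curve from which memorisation dynamics could be read off; the toy model above jumps from no recall to full recall in one step.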
Taxonomy
Topics: Translation Studies and Practices · Semantic Web and Ontologies · Natural Language Processing Techniques
