Causal Estimation of Memorisation Profiles
Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel

TL;DR
This paper introduces a new, efficient method for estimating memorisation in language models, enabling analysis of how memorisation develops during training and how it depends on factors such as model size, data order, and learning rate.
Contribution
It proposes a principled, computationally efficient approach to measure memorisation at the model instance level using a difference-in-differences design.
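To make the difference-in-differences idea concrete, here is a minimal sketch of the basic two-group, two-period estimator. It is an illustration of the general econometric design, not the paper's exact estimator. The function name `did_estimate`, the choice of per-instance log-likelihood as the outcome, and all numbers are assumptions made up for this example: "treated" instances are those the model trained on between two checkpoints, and "control" instances are ones it has not yet seen.

```python
import numpy as np


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences estimate (hypothetical helper).

    Each argument is an array of per-instance outcomes (here assumed to be
    log-likelihoods) measured at a checkpoint before or after the treated
    group was used for training.
    """
    # Change in the outcome for instances the model trained on.
    treated_change = np.mean(treated_post) - np.mean(treated_pre)
    # Change for unseen instances: stands in for the counterfactual trend,
    # i.e. how the outcome would have moved without training on the instance.
    control_change = np.mean(control_post) - np.mean(control_pre)
    # The difference of the two changes is attributed to training on the
    # treated instances (under the parallel-trends assumption).
    return treated_change - control_change


# Made-up illustrative values, not results from the paper.
treated_pre = np.array([-3.0, -2.8, -3.2])
treated_post = np.array([-2.0, -1.9, -2.1])
control_pre = np.array([-3.1, -2.9, -3.0])
control_post = np.array([-2.7, -2.5, -2.6])

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated memorisation effect: {effect:.2f}")
```

Subtracting the control group's change removes the general improvement that comes from continued training, so what remains is an estimate of the gain attributable to having seen the treated instances. The estimate is only as credible as the assumption that both groups would have followed the same trend without treatment.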
Findings
Larger models exhibit stronger and more persistent memorisation.
Memorisation is influenced by data order and learning rate.
Memorisation trends are stable across different model sizes.
Abstract
Understanding memorisation in language models has practical and societal implications, e.g., studying models' training dynamics or preventing copyright infringements. Prior work defines memorisation as the causal effect of training with an instance on the model's ability to predict that instance. This definition relies on a counterfactual: the ability to observe what would have happened had the model not seen that instance. Existing methods struggle to provide computationally efficient and accurate estimates of this counterfactual. Further, they often estimate memorisation for a model architecture rather than for a specific model instance. This paper fills an important gap in the literature, proposing a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics. Using this method, we characterise a model's memorisation…
Taxonomy
Topics: Anomaly Detection Techniques and Applications
Methods: Sparse Evolutionary Training · Pythia
