Memorization or Interpolation? Detecting LLM Memorization through Input Perturbation Analysis
Albérick Euraste Djiré, Abdoul Kader Kaboré, Earl T. Barr, Jacques Klein, Tegawendé F. Bissyandé

TL;DR
This paper presents PEARL, a new method to detect memorization in Large Language Models by analyzing their output sensitivity to input perturbations, helping distinguish memorized data from true generalization.
Contribution
PEARL offers a framework for identifying memorization in LLMs through input perturbation analysis without requiring access to the model's internals, advancing privacy and reliability assessments.
Findings
PEARL successfully detects memorization in GPT models.
It can identify memorized texts like Bible passages and code snippets.
The method can provide supporting evidence that specific data, such as news articles, was included in a model's training data.
Abstract
While Large Language Models (LLMs) achieve remarkable performance through training on massive datasets, they can exhibit concerning behaviors such as verbatim reproduction of training data rather than true generalization. This memorization phenomenon raises significant concerns about data privacy, intellectual property rights, and the reliability of model evaluations. This paper introduces PEARL, a novel approach for detecting memorization in LLMs. PEARL assesses how sensitive an LLM's performance is to input perturbations, enabling memorization detection without requiring access to the model's internals. We investigate how input perturbations affect the consistency of outputs, enabling us to distinguish between true generalization and memorization. Our findings, following extensive experiments on the Pythia open model, provide a robust framework for identifying when the model simply…
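The idea described in the abstract, measuring how much a model's output quality changes when its input is perturbed, can be illustrated with a small sketch. This is not the authors' implementation: the word-masking perturbation, the `difflib`-based similarity score, the function names (`perturb`, `similarity`, `sensitivity`), and the two toy models are all assumptions made for illustration. It also assumes the reading that a sharp drop under perturbation points toward memorization, while a stable score points toward generalization.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, Sequence


def perturb(text: str, rate: float, rng: random.Random) -> str:
    """Mask about `rate` of the words (at least one), keeping the word count."""
    words = text.split()
    if not words:
        return text
    k = min(len(words), max(1, round(rate * len(words))))
    for i in rng.sample(range(len(words)), k):
        words[i] = "[MASK]"
    return " ".join(words)


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a task metric."""
    return SequenceMatcher(None, a, b).ratio()


def sensitivity(
    model: Callable[[str], str],
    prompt: str,
    reference: str,
    rates: Sequence[float] = (0.1, 0.3, 0.5),
    trials: int = 5,
    seed: int = 0,
) -> float:
    """Mean drop in score between the clean prompt and perturbed prompts."""
    rng = random.Random(seed)
    baseline = similarity(model(prompt), reference)
    drops = []
    for rate in rates:
        for _ in range(trials):
            output = model(perturb(prompt, rate, rng))
            drops.append(baseline - similarity(output, reference))
    return sum(drops) / len(drops)


# Hypothetical toy models, used only to exercise the sketch.
PROMPT = "In the beginning God created the heaven and the earth"
CONTINUATION = "And the earth was without form, and void"
MEMORIZED = {PROMPT: CONTINUATION}


def memorizing_model(prompt: str) -> str:
    """Returns a stored continuation only for an exact prompt match."""
    return MEMORIZED.get(prompt, "")


def word_count_model(prompt: str) -> str:
    """Solves a simple task (counting words) that masking does not disturb."""
    return str(len(prompt.split()))


if __name__ == "__main__":
    print(sensitivity(memorizing_model, PROMPT, CONTINUATION))  # 1.0
    print(sensitivity(word_count_model, PROMPT, "10"))          # 0.0
```

In this toy setting the lookup-table model loses its entire score as soon as a single word is masked, while the word-counting model is unaffected. A real black-box study would replace the toy callables with calls to the model under test and choose perturbations and metrics suited to the task.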
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Attention Is All You Need · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · Linear Layer · Multi-Head Attention · Dense Connections · Discriminative Fine-Tuning
