Memories Retrieved from Many Paths: A Multi-Prefix Framework for Robust Detection of Training Data Leakage in Large Language Models

Trung Cuong Dang; David Mohaisen

arXiv:2511.20799 · cs.CL · November 27, 2025

TL;DR

This paper introduces a multi-prefix framework to detect training data leakage in large language models by measuring the diversity of retrieval paths, improving robustness over previous single-path methods.

Contribution

The paper proposes a multi-prefix memorization framework that captures deeply encoded memorization by analyzing multiple retrieval paths, improving detection of training data leakage.

Findings

01. The multi-prefix method reliably distinguishes memorized from non-memorized data.

02. The framework makes detection of data leakage in LLMs more robust.

03. Experiments on open-source and aligned chat models validate the framework's effectiveness.

Abstract

Large language models, trained on massive corpora, are prone to verbatim memorization of training data, creating significant privacy and copyright risks. While previous works have proposed various definitions for memorization, many exhibit shortcomings in comprehensively capturing this phenomenon, especially in aligned models. To address this, we introduce a novel framework: multi-prefix memorization. Our core insight is that memorized sequences are deeply encoded and thus retrievable via a significantly larger number of distinct prefixes than non-memorized content. We formalize this by defining a sequence as memorized if an external adversarial search can identify a target count of distinct prefixes that elicit it. This framework shifts the focus from single-path extraction to quantifying the robustness of a memory, measured by the diversity of its retrieval paths. Through experiments…
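
The definition in the abstract (a sequence counts as memorized if a search finds a target number of distinct prefixes that elicit it) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' implementation: `toy_generate` is a hypothetical stand-in for a language model, the fixed `candidates` list replaces the paper's adversarial prefix search, and the threshold `k` plays the role of the target count.

```python
from typing import Callable, Iterable, List


def eliciting_prefixes(
    generate: Callable[[str], str],
    candidate_prefixes: Iterable[str],
    target: str,
) -> List[str]:
    """Return the distinct candidate prefixes whose continuation starts with `target`."""
    seen = set()
    hits = []
    for prefix in candidate_prefixes:
        if prefix in seen:  # count each prefix only once
            continue
        seen.add(prefix)
        if generate(prefix).startswith(target):
            hits.append(prefix)
    return hits


def is_multi_prefix_memorized(
    generate: Callable[[str], str],
    candidate_prefixes: Iterable[str],
    target: str,
    k: int,
) -> bool:
    """True if at least `k` distinct prefixes elicit `target` (k is a hypothetical threshold)."""
    return len(eliciting_prefixes(generate, candidate_prefixes, target)) >= k


TARGET = "O say can you see"


def toy_generate(prefix: str) -> str:
    # Hypothetical stand-in for an LLM: it emits the "memorized" string
    # whenever the prefix mentions the word "anthem".
    if "anthem" in prefix.lower():
        return TARGET
    return "I am not sure."


# A fixed candidate list stands in for the adversarial search (assumption).
candidates = [
    "Sing the anthem:",
    "anthem lyrics please",
    "What is 2+2?",
    "Sing the anthem:",  # duplicate, ignored when counting distinct prefixes
]

hits = eliciting_prefixes(toy_generate, candidates, TARGET)
print(len(hits), is_multi_prefix_memorized(toy_generate, candidates, TARGET, k=2))
```

In this toy setting the count of distinct eliciting prefixes is the quantity of interest: a single successful prefix would satisfy a single-path definition, while the multi-prefix criterion asks whether the target is reachable from at least `k` different starting points.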

Taxonomy

Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques