Sequence-Level Leakage Risk of Training Data in Large Language Models
Trishita Tiwari, G. Edward Suh

TL;DR
This paper introduces a sequence-level probability metric to better quantify training data leakage risks in large language models, revealing that previous metrics underestimated threats and that leakage dynamics vary with model size, prefix length, and decoding schemes.
Contribution
The paper proposes a new sequence-level leakage metric that gives a more accurate risk assessment and shows how different factors affect data extraction from LLMs.
Findings
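Extraction Rate, the predominant metric in prior work, underestimates leakage risk in randomized LLMs by up to 2.14X.
For certain sequences, smaller models and shorter prefixes are more vulnerable, even though larger models and longer prefixes extract more data on average.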
Partial leakage is not easier than verbatim data leakage in common decoding schemes.
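The difference between a binary extraction metric and a sequence-level probability can be illustrated with a small sketch. This is not the authors' implementation: the function names and toy probabilities are hypothetical, the sketch assumes that Extraction Rate counts a sequence as leaked only when greedy decoding reproduces it, and it assumes samples are drawn independently.

```python
import math

# Hypothetical toy data: for each sequence, one (target_prob, max_prob) pair per
# suffix token. target_prob is the model's probability of the true training
# token; max_prob is the probability of the most likely token at that step.
SEQUENCES = [
    [(0.9, 0.9), (0.8, 0.8)],  # the true token is the argmax at every step
    [(0.4, 0.5), (0.9, 0.9)],  # the true token is not the argmax at step 1
]

def sequence_probability(steps):
    """Chain rule: P(suffix | prefix) is the product of per-token probabilities."""
    probs = [target for target, _ in steps]
    if min(probs) <= 0.0:
        return 0.0
    # Sum logs instead of multiplying to avoid underflow on long suffixes.
    return math.exp(sum(math.log(p) for p in probs))

def greedy_extracts(steps):
    """Assumed binary criterion: greedy decoding emits the whole suffix."""
    return all(target >= top for target, top in steps)

def extraction_rate(sequences):
    """Fraction of sequences that the binary criterion marks as extracted."""
    return sum(greedy_extracts(s) for s in sequences) / len(sequences)

def prob_extracted_within(p, n):
    """Chance of at least one exact match in n independent samples."""
    return 1.0 - (1.0 - p) ** n

if __name__ == "__main__":
    print("extraction rate:", extraction_rate(SEQUENCES))
    for steps in SEQUENCES:
        p = sequence_probability(steps)
        print("sequence probability:", p, "within 10 samples:", prob_extracted_within(p, 10))
```

In this toy data the second sequence contributes nothing to the binary rate, yet random sampling would still emit it with probability 0.4 × 0.9 = 0.36 per attempt, which is the kind of risk a per-sequence probability exposes.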
Abstract
This work quantifies the risk of training data leakage from large language models (LLMs) using sequence-level probabilities. Computing extraction probabilities for individual sequences provides finer-grained information than prior benchmarking work has studied. We re-analyze the effects of decoding schemes, model sizes, prefix lengths, partial sequence leakages, and token positions to uncover new insights that were not possible in previous work due to its choice of metrics. We perform this study on two pre-trained models, Llama and OPT, trained on the Common Crawl and The Pile, respectively. We discover that 1) Extraction Rate, the predominant metric used in prior quantification work, underestimates the threat of leakage of training data in randomized LLMs by as much as 2.14X. 2) Although on average, larger models and longer prefixes can extract more data, this is not true for…
Taxonomy
Topics: Topic Modeling
Methods: LLaMA · OPT
