A Little Leak Will Sink a Great Ship: Survey of Transparency for Large Language Models from Start to Finish
Masahiro Kaneko, Timothy Baldwin

TL;DR
This paper surveys data leakage issues in large language models, examining how leakage rate impacts output and detection, and introduces a self-detection method that improves identification of leaked data.
Contribution
It provides an experimental analysis of leakage effects and proposes a novel self-detection approach using few-shot learning for better leak identification.
Findings
LLMs often generate leaked information despite low leakage rates.
Small amounts of leaked data significantly influence model outputs.
The proposed self-detection method outperforms existing detection techniques.
Abstract
Large Language Models (LLMs) are trained on massive web-crawled corpora. This poses risks of leakage, including personal information, copyrighted texts, and benchmark datasets. Such leakage leads to undermining human trust in AI due to potential unauthorized generation of content or overestimation of performance. We establish the following three criteria concerning the leakage issues: (1) leakage rate: the proportion of leaked data in training data, (2) output rate: the ease of generating leaked data, and (3) detection rate: the detection performance of leaked versus non-leaked data. Despite the leakage rate being the origin of data leakage issues, it is not understood how it affects the output rate and detection rate. In this paper, we conduct an experimental survey to elucidate the relationship between the leakage rate and both the output rate and detection rate for personal…
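Two of the three criteria in the abstract can be read as simple ratios. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `leakage_rate` and `detection_accuracy` are hypothetical, each training instance is assumed to carry a boolean "leaked" label, and the detection rate is approximated here as plain accuracy on leaked versus non-leaked instances. The output rate is left out because the abstract does not say how "ease of generating" is measured.

```python
from typing import Sequence


def leakage_rate(is_leaked: Sequence[bool]) -> float:
    """Proportion of training instances labeled as leaked (assumed labels)."""
    if not is_leaked:
        raise ValueError("need at least one training instance")
    return sum(is_leaked) / len(is_leaked)


def detection_accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of instances whose leaked/non-leaked prediction matches the label.

    Accuracy is an assumption made for this sketch; the paper may report
    a different detection metric.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy data: one leaked instance out of four.
labels = [True, False, False, False]
# Hypothetical detector output with one false positive.
guesses = [True, False, True, False]

rate = leakage_rate(labels)
accuracy = detection_accuracy(guesses, labels)
```

On this toy data, a detector that flags one extra instance still matches three of the four labels, which shows why accuracy alone can look favorable when the leakage rate is low.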
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
