Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data
Minseo Kwak, Jaehyung Kim

TL;DR
Gap-K% is a method for detecting pretraining data in large language models. It scores text by the log probability gap between the model's top-1 prediction and the target token, and it improves accuracy over previous approaches.
Contribution
The paper proposes Gap-K%, a detection technique motivated by the optimization dynamics of LLM pretraining. It combines a divergence signal (how far the target token falls below the model's top-1 prediction) with the local correlation between adjacent tokens.
Findings
Achieves state-of-the-art results on WikiMIA and MIMIR benchmarks.
Outperforms prior methods across different model sizes and input lengths.
Effectively captures local token correlations and divergence signals.
Abstract
The opacity of massive pretraining corpora in Large Language Models (LLMs) raises significant privacy and copyright concerns, making pretraining data detection a critical challenge. Existing state-of-the-art methods typically rely on token likelihoods, yet they often overlook the divergence from the model's top-1 prediction and local correlation between adjacent tokens. In this work, we propose Gap-K%, a novel pretraining data detection method grounded in the optimization dynamics of LLM pretraining. By analyzing the next-token prediction objective, we observe that discrepancies between the model's top-1 prediction and the target token induce strong gradient signals, which are explicitly penalized during training. Motivated by this, Gap-K% leverages the log probability gap between the top-1 predicted token and the target token, incorporating a sliding window strategy to capture local…
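The core quantity described in the abstract, the log probability gap between the top-1 token and the target token smoothed over a sliding window, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract is cut off before the full scoring rule, so the function names, the window size, and the final aggregation (averaging the largest K% of windowed gaps and negating, in the spirit of the method's name) are hypothetical choices made here.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def top1_gaps(logits_per_step, target_ids):
    """Per-token gap: log p(top-1 token) - log p(target token), always >= 0."""
    gaps = []
    for logits, target in zip(logits_per_step, target_ids):
        logp = log_softmax(logits)
        gaps.append(max(logp) - logp[target])
    return gaps


def sliding_mean(values, window):
    """Mean over each contiguous window; one window if the input is short."""
    if window < 1:
        raise ValueError("window must be >= 1")
    if len(values) <= window:
        return [sum(values) / len(values)]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def gap_score(logits_per_step, target_ids, window=2, k=0.5):
    """ASSUMED aggregation: negative mean of the largest k fraction of
    windowed gaps. A higher score means the text looks more 'seen'."""
    if not target_ids:
        raise ValueError("empty sequence")
    smoothed = sliding_mean(top1_gaps(logits_per_step, target_ids), window)
    n = max(1, math.ceil(k * len(smoothed)))
    worst = sorted(smoothed, reverse=True)[:n]
    return -sum(worst) / n


# Toy example with a 3-token vocabulary (logits are made up).
toy_logits = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
score_match = gap_score(toy_logits, [0, 1, 2])     # target is always top-1
score_mismatch = gap_score(toy_logits, [1, 2, 0])  # target is never top-1
```

In this toy setup the sequence whose targets coincide with the top-1 predictions gets a gap of zero at every position, so it scores higher than the mismatched sequence. In practice the logits would come from the language model under test, and a decision threshold would have to be calibrated on held-out data.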
Taxonomy
Topics: Topic Modeling · Authorship Attribution and Profiling · Computational and Text Analysis Methods
