Proving Test Set Contamination in Black Box Language Models

Yonatan Oren; Nicole Meister; Niladri Chatterji; Faisal Ladhak; Tatsunori B. Hashimoto

arXiv:2310.17623 · cs.CL · November 27, 2023 · 5 cites

TL;DR

This paper introduces a method to detect test set contamination in large language models by analyzing the likelihood of canonical versus shuffled benchmark orderings, without needing access to pretraining data or model weights.

Contribution

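The paper presents a contamination test with provable guarantees. The test relies on two observations: the examples of an exchangeable benchmark can appear in any order with equal likelihood under an uncontaminated model, and language models tend to memorize the order of examples they were trained on.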

Findings

01. Reliable detection of contamination in models as small as 1.4 billion parameters

02. Effective on small test sets of only 1,000 examples

03. Little evidence of pervasive contamination in tested models

Abstract

Large language models are trained on vast amounts of internet data, prompting concerns and speculation that they have memorized public benchmarks. Going from speculation to proof of contamination is challenging, as the pretraining data used by proprietary models are often not publicly accessible. We show that it is possible to provide provable guarantees of test set contamination in language models without access to pretraining data or model weights. Our approach leverages the fact that when there is no data contamination, all orderings of an exchangeable benchmark should be equally likely. In contrast, the tendency for language models to memorize example order means that a contaminated language model will find certain canonical orderings to be much more likely than others. Our test flags potential contamination whenever the likelihood of a canonically ordered benchmark dataset is…
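The ordering idea in the abstract can be illustrated with a generic permutation test. The following is a minimal sketch, not the authors' released implementation: `permutation_p_value`, `toy_contaminated_scorer`, and `toy_clean_scorer` are hypothetical names, and the scoring functions are toy stand-ins for a real model's log-likelihood of a benchmark in a given order. In practice the scorer would query the language model under test.

```python
import random
from typing import Callable, List, Sequence


def permutation_p_value(
    examples: Sequence[str],
    log_likelihood: Callable[[Sequence[str]], float],
    num_shuffles: int = 99,
    seed: int = 0,
) -> float:
    """Estimate how unusual the canonical ordering's score is among shuffles.

    Under exchangeability (no contamination), the canonical ordering is just
    one ordering among many, so a small p-value suggests the scorer prefers
    the canonical order.
    """
    rng = random.Random(seed)
    canonical_score = log_likelihood(examples)
    at_least_as_likely = 0
    for _ in range(num_shuffles):
        shuffled: List[str] = list(examples)
        rng.shuffle(shuffled)
        if log_likelihood(shuffled) >= canonical_score:
            at_least_as_likely += 1
    # The +1 terms count the canonical ordering itself, keeping the p-value valid.
    return (1 + at_least_as_likely) / (1 + num_shuffles)


def toy_contaminated_scorer(canonical: Sequence[str]) -> Callable[[Sequence[str]], float]:
    """Toy stand-in for a model that memorized which example follows which."""
    follows = set(zip(canonical, canonical[1:]))

    def score(ordering: Sequence[str]) -> float:
        # Count adjacent pairs that appear in the memorized canonical order.
        return float(sum((a, b) in follows for a, b in zip(ordering, ordering[1:])))

    return score


def toy_clean_scorer(ordering: Sequence[str]) -> float:
    """Toy stand-in for an uncontaminated model: the score ignores order."""
    return float(sum(len(example) for example in ordering))


if __name__ == "__main__":
    benchmark = [f"q{i}" for i in range(20)]
    print("contaminated:", permutation_p_value(benchmark, toy_contaminated_scorer(benchmark)))
    print("clean:", permutation_p_value(benchmark, toy_clean_scorer))
```

A small p-value from the order-sensitive scorer and a large one from the order-invariant scorer mirror the contrast the abstract describes. A real test would also have to confirm that the benchmark is actually exchangeable, since a dataset sorted by topic or difficulty could favor its canonical order without any contamination.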

Code & Models

Repositories

tatsu-lab/test_set_contamination (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Sparse Evolutionary Training