Time Travel in LLMs: Tracing Data Contamination in Large Language Models

Shahriar Golchin; Mihai Surdeanu

arXiv:2308.08493 · cs.CL · February 23, 2024 · 22 cites

TL;DR

This paper introduces a method for detecting data contamination in large language models at both the instance and partition level, using guided instruction prompts and statistical measures; it reports 92-100% accuracy in contamination detection.

Contribution

The paper presents a new, effective approach for identifying data contamination in LLMs at both instance and partition levels, utilizing guided instruction prompts and statistical tests.
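As a rough illustration of how a partition-level statistical test might be set up, the sketch below compares per-instance overlap scores obtained under a guided prompt with scores obtained under a general prompt (assumed here to be the same request without the dataset details) and asks whether the guided scores are significantly higher. The paired bootstrap, the significance level, and every function name are assumptions made for illustration; this page does not state which test the authors use.

```python
import random


def paired_bootstrap_pvalue(guided, general, n_resamples=10_000, seed=0):
    """One-sided paired bootstrap: is mean(guided) > mean(general)?

    Returns the fraction of resamples whose mean difference is <= 0.
    Assumption: scores are per-instance overlap values in [0, 1].
    """
    if len(guided) != len(general) or not guided:
        raise ValueError("need two non-empty score lists of equal length")
    diffs = [g - b for g, b in zip(guided, general)]
    rng = random.Random(seed)  # seeded so the result is reproducible
    n = len(diffs)
    not_better = 0
    for _ in range(n_resamples):
        # Resample the paired differences with replacement.
        sample_mean = sum(rng.choice(diffs) for _ in range(n)) / n
        if sample_mean <= 0:
            not_better += 1
    return not_better / n_resamples


def partition_flagged(guided, general, alpha=0.05):
    """Hypothetical decision rule: flag when the guided scores win significantly."""
    return paired_bootstrap_pvalue(guided, general) < alpha
```

The threshold `alpha=0.05` is a conventional default, not a value taken from the paper.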

Findings

01. Achieves 92-100% accuracy in contamination detection, measured against human expert evaluation

02. Identifies GPT-4 contamination with the AG News, WNLI, and XSum datasets

03. Provides a scalable method for contamination assessment

Abstract

Data contamination, i.e., the presence of test data from downstream tasks in the training data of large language models (LLMs), is a potential major issue in measuring LLMs' real effectiveness on other tasks. We propose a straightforward yet effective method for identifying data contamination within LLMs. At its core, our approach starts by identifying potential contamination at the instance level; using this information, our approach then assesses wider contamination at the partition level. To estimate contamination of individual instances, we employ "guided instruction:" a prompt consisting of the dataset name, partition type, and the random-length initial segment of a reference instance, asking the LLM to complete it. An instance is flagged as contaminated if the LLM's output either exactly or nearly matches the latter segment of the reference. To understand if an entire partition is…
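The instance-level step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt wording is hypothetical (only its three ingredients, dataset name, partition type, and a random-length prefix, come from the abstract), and `difflib.SequenceMatcher` with a 0.8 threshold is an assumed stand-in for whatever exact/near-match criterion the paper applies. The LLM call itself is left out.

```python
import random
from difflib import SequenceMatcher


def split_instance(text, rng):
    """Split a reference instance into a random-length prefix and the remainder."""
    words = text.split()
    if len(words) < 2:
        raise ValueError("instance needs at least two words to split")
    cut = rng.randint(1, len(words) - 1)  # keeps both pieces non-empty
    return " ".join(words[:cut]), " ".join(words[cut:])


def build_guided_prompt(dataset, partition, prefix):
    """Build a guided instruction (hypothetical wording, for illustration only)."""
    return (
        f"You are given the first piece of an instance from the {partition} "
        f"split of the {dataset} dataset. Complete the second piece exactly "
        f"as it appears in the dataset.\n\n"
        f"First piece: {prefix}\nSecond piece:"
    )


def match_level(completion, reference_tail, near_threshold=0.8):
    """Classify a completion as 'exact', 'near', or 'none'.

    Assumption: whitespace- and case-normalized equality counts as exact,
    and a character-level similarity ratio stands in for a near match.
    """
    a = " ".join(completion.split()).lower()
    b = " ".join(reference_tail.split()).lower()
    if a == b:
        return "exact"
    ratio = SequenceMatcher(None, a, b).ratio()
    return "near" if ratio >= near_threshold else "none"


def is_flagged(completion, reference_tail):
    """An instance is flagged if the completion exactly or nearly matches."""
    return match_level(completion, reference_tail) != "none"
```

In use, one would call `split_instance` on a reference instance, send the result of `build_guided_prompt` to the model under test, and pass the model's completion together with the held-back remainder to `is_flagged`.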

Peer Reviews

Decision · ICLR 2024 spotlight

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

The proposed method is straightforward and adaptable to a wide range of datasets.

Weaknesses

1. I have concerns regarding the soundness of the paper's evaluation methodology. The proposed method hinges on the assumption that a data instance is contaminated in an LLM if the LLM can complete the instance based on its prefix. The paper's evaluation primarily revolves around how well the proposed methods compare to human experts under this assumption. However, there are doubts about whether the underlying assumption holds, for several reasons. (1) The inability of an LLM to co

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 2

Strengths

- Intuitive guided and general prompts to detect instance-level contamination.
- Approximating human expert classification for exact and approximate match using GPT-4 as a classifier, i.e., approximating semantic match.
- Validation on a known contaminated LLM.

Weaknesses

- The authors rely on the algorithm to begin with when deciding which partitions were not leaked and should be added during fine-tuning. This has a circular dependence/assumption. (This point was addressed during discussion with the authors as a writing/explanation issue rather than a true circular dependence.)
- Different levels of data leakage are not considered. For example, would GPT-4 be detected as having seen partitions of datasets that follow well-known formats seen from other datasets if i

Reviewer 03 · Rating 8 (accept, good paper) · Confidence 3

Strengths

Originality: The paper offers a fresh perspective on assessing the capabilities of LLMs in terms of potential dataset contamination. The methodologies introduced, especially the use of GPT-4's few-shot in-context learning, are innovative.

Quality: The research appears thorough, with detailed evaluations using two different algorithms. The results are well-tabulated, and the comparison with ChatGPT-Cheat offers a clearer understanding of the proposed methods' effectiveness.

Clarity: The paper is st

Weaknesses

Scope: The paper focuses primarily on GPT-3.5 and GPT-4. A broader range of LLMs could provide more generalizable insights.

Code & Models

Repositories

shahriargolchin/time-travel-in-llms · Official

Videos

Time Travel in LLMs: Tracing Data Contamination in Large Language Models · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Layer Normalization · Softmax · Absolute Position Encodings · Residual Connection · Dense Connections · Dropout