How Contaminated Is Your Benchmark? Quantifying Dataset Leakage in Large Language Models with Kernel Divergence

Hyeong Kyu Choi; Maxim Khanov; Hongxin Wei; Yixuan Li

arXiv:2502.00678·cs.LG·May 22, 2025

TL;DR

This paper introduces the Kernel Divergence Score (KDS), a method that quantifies dataset contamination in large language models by measuring how much the kernel similarity matrix of sample embeddings changes after fine-tuning on the benchmark, with the aim of making evaluations of model generalization more reliable.

Contribution

The paper proposes KDS, a novel kernel-based metric for accurately measuring dataset contamination and distinguishing between seen and unseen samples in model evaluation.

Findings

1. KDS correlates strongly with contamination levels.
2. KDS outperforms existing baseline methods.
3. Ablation studies confirm the effectiveness of kernel-based features.

Abstract

Dataset contamination, where evaluation datasets overlap with pre-training corpora, inflates performance metrics and undermines the reliability of model evaluations. Measuring dataset contamination thus becomes essential to ensure that performance evaluations genuinely reflect a model's ability to generalize to unseen data, rather than relying on memorized examples. To address this problem, we propose Kernel Divergence Score (KDS), a novel method that evaluates dataset contamination by computing the divergence between the kernel similarity matrix of sample embeddings, before and after fine-tuning on the benchmark dataset. Leveraging the insight that fine-tuning affects unseen samples more significantly than seen ones, KDS provides a reliable measure of contamination. Through extensive experiments on controlled contamination scenarios, KDS demonstrates a near-perfect correlation with…
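The core computation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the `gamma` bandwidth, the KL-style elementwise divergence, and the function names `rbf_kernel_matrix` and `kernel_divergence` are all assumptions chosen for illustration, and the embeddings are taken as given arrays rather than extracted from a model before and after fine-tuning.

```python
import numpy as np


def rbf_kernel_matrix(embeddings, gamma=1.0):
    """Pairwise RBF similarities exp(-gamma * ||z_i - z_j||^2).

    The RBF kernel and the gamma value are assumptions made for this sketch.
    """
    sq_norms = np.sum(embeddings ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embeddings @ embeddings.T
    # Clamp tiny negative values caused by floating-point error.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-gamma * sq_dists)


def kernel_divergence(emb_before, emb_after, gamma=1.0, eps=1e-12):
    """Mean absolute KL-style divergence between two kernel matrices.

    Hypothetical stand-in for the paper's score: it only measures how much
    pairwise similarities changed between the two sets of embeddings.
    """
    k_before = rbf_kernel_matrix(emb_before, gamma)
    k_after = rbf_kernel_matrix(emb_after, gamma)
    log_ratio = np.log((k_before + eps) / (k_after + eps))
    return float(np.mean(np.abs(k_before * log_ratio)))


# Toy embeddings standing in for "before fine-tuning" and "after fine-tuning".
before = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
after = 2.0 * before
unchanged_score = kernel_divergence(before, before)
changed_score = kernel_divergence(before, after)
```

Under the abstract's insight that fine-tuning affects unseen samples more than seen ones, a smaller change in the kernel matrix would be suggestive of more previously seen samples. The sign and normalization conventions of the actual score are not stated in the text above, so this sketch reports only the raw magnitude of change.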

Code & Models

Repositories

deeplearning-wisc/kernel-divergence-score (PyTorch, official)

Taxonomy

Topics: Topic Modeling