No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models

Omer Sela (Tel Aviv University)

arXiv:2603.03203·cs.AI·March 12, 2026

TL;DR

This paper evaluates output distribution-based contamination detection (CDD) in small language models, showing that it often fails and that probability-based methods such as perplexity outperform it.

Contribution

The paper analyzes the failure modes of CDD (Contamination Detection via output Distribution) and shows that probability-based methods outperform it at detecting contamination in small models.

Findings

01

In most tested conditions, CDD performs no better than chance on small models, even when the data is verifiably contaminated.

02

Perplexity and Min-k% Prob outperform CDD in contamination detection.

03

Fine-tuning does not reliably produce memorization detectable by CDD.
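
To make the probability-based baselines in the findings concrete, here is a minimal sketch of perplexity and Min-k% Prob computed from per-token log-probabilities. It assumes the log-probabilities have already been obtained from a model; the function names and the default k = 0.2 are illustrative choices, not settings taken from the paper.

```python
import math


def perplexity(token_logprobs):
    """Perplexity: exp of the negative mean token log-probability.

    Lower perplexity on a benchmark item can hint that the model saw it
    during training.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def min_k_percent_prob(token_logprobs, k=0.2):
    """Min-k% Prob: mean log-probability of the k fraction of tokens
    that the model finds least likely.

    k=0.2 is an illustrative default (an assumption), not the paper's value.
    A higher (less negative) score suggests the text is more familiar.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]  # the n least likely tokens
    return sum(lowest) / n


# Hypothetical log-probabilities for a ten-token sequence.
example = [-0.1, -0.2, -3.0, -0.1, -2.5, -0.3, -0.2, -0.1, -0.4, -0.1]
ppl = perplexity(example)
mkp = min_k_percent_prob(example, k=0.2)
```

Both scores need only one forward pass over the benchmark text, whereas an output distribution-based method has to sample many completions per item.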

Abstract

CDD, or Contamination Detection via output Distribution, identifies data contamination by measuring the peakedness of a model's sampled outputs. We study the conditions under which this approach succeeds and fails on small language models ranging from 70M to 410M parameters. Using controlled contamination experiments on GSM8K, HumanEval, and MATH, we find that CDD's effectiveness depends critically on whether fine-tuning produces verbatim memorization. In the majority of conditions we test, CDD performs at chance level even when the data is verifiably contaminated and detectable by simpler methods. We show that probability-based methods, specifically perplexity and Min-k% Prob, outperform CDD in all conditions where any method exceeds chance, suggesting that CDD's peakedness-based approach is insufficient for contamination detection in small language models. Our code is available at…
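
The peakedness idea in the abstract can be illustrated with a small sketch. One way to quantify it, assumed here and not confirmed by this summary, is the fraction of sampled outputs whose token-level edit distance to a reference output (for example, a greedy decode) falls within a small threshold. The function names, the `alpha` threshold, and the choice of reference are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def peakedness(reference, samples, alpha=0.05):
    """Fraction of samples within alpha * max(length) edits of the reference.

    alpha is a hypothetical threshold (an assumption). Values near 1.0 mean
    the sampled outputs collapse onto nearly one string, which is the
    signature a peakedness-based detector looks for.
    """
    if not samples:
        raise ValueError("need at least one sampled output")
    close = 0
    for sample in samples:
        limit = alpha * max(len(reference), len(sample))
        if edit_distance(reference, sample) <= limit:
            close += 1
    return close / len(samples)


# Hypothetical tokenized outputs: two exact copies, one near copy, one unrelated.
ref = "a b c d".split()
outs = [ref, ref, "a b c x".split(), "x y z w".split()]
score = peakedness(ref, outs, alpha=0.25)
```

Under this sketch, a fine-tuned model that has not memorized the data verbatim would produce varied samples and a low score, which is consistent with the failure condition the abstract describes.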

Taxonomy

Topics: Natural Language Processing Techniques · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms