LogProber: Disentangling confidence from contamination in LLM responses
Nicolas Yax, Pierre-Yves Oudeyer, Stefano Palminteri

TL;DR
LogProber is a new algorithm designed to detect data contamination in large language models by assessing question familiarity, improving fairness in performance evaluation of these models.
Contribution
It introduces a novel, efficient contamination detection method focusing on question familiarity, addressing limitations of previous approaches in black box settings.
Findings
LogProber effectively detects contamination in LLMs.
It outperforms some existing methods in certain scenarios.
The method has limitations depending on contamination types.
Abstract
In machine learning, contamination refers to situations where testing data leak into the training set. The issue is particularly relevant for the evaluation of the performance of Large Language Models (LLMs), which are generally trained on gargantuan, and generally opaque, corpora of text scraped from the world wide web. Developing tools to detect contamination is therefore crucial to be able to fairly and properly track the evolution of the performance of LLMs. To date, only a few recent studies have attempted to address the issue of quantifying and detecting contamination in short text sequences, such as those commonly found in benchmarks. However, these methods have limitations that can sometimes render them impractical. In the present paper, we introduce LogProber, a novel, efficient algorithm that we show to be able to detect contamination in a black box setting that tries to…
Peer Reviews
Decision·Submitted to ICLR 2026
1. Solves the "Confidence" Flaw: The paper's main strength is identifying that existing detectors mistake a model's high confidence for contamination. Its novel solution is to analyze the question text instead of the answer, successfully disentangling genuine skill from memorization. 2. High Transparency: The authors are rigorous and transparent about the tool's limitations. They explicitly demonstrate that LogProber is blind to "answer-only" (-A) contamination, which is a common format for fi
1. The writing can be improved -- the introductory content is too long and the citation format can be further improved. Most importantly, the paper will benefit from adding a conclusion section and related work section. These two are clearly missing. 2. The Llama-1-7B model used in the experiment is too old. And the baselines are too few and not strong enough. It only compared with CDD, while data contamination detection is not a new topic and there are a lot of existing work defining and addre
The paper addresses a fundamental, high-impact problem. As models become more powerful, their performance on standard benchmarks is increasingly scrutinized for contamination. This work provides a practical tool to help maintain the integrity of LLM evaluation.
- The paper introduces a specific, non-trivial formula for the "Safe Score" based on the integral of the sorted cumulative log-probabilities (Equation 1). However, there is no justification provided for why this specific formulation is optimal, or even necessary, compared to simpler, more direct statistical measures of the "flatness" of the $\log(p)$ curve. For instance, what about the simple variance of the $\log(p)$ values? Or the 10th percentile of $\log(p)$? A contaminated sequence should have
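The two simpler statistics this review names can be sketched as follows. This is a minimal illustration, assuming the per-token log-probabilities of a question have already been obtained from a model; the function names and the sample values are hypothetical, and this is not the paper's Safe Score (Equation 1). Whether either statistic separates contaminated from clean questions is precisely what the reviewer is asking.

```python
import math
import statistics


def logprob_variance(token_logprobs):
    """Population variance of the per-token log-probabilities."""
    return statistics.pvariance(token_logprobs)


def logprob_percentile(token_logprobs, q=10.0):
    """q-th percentile, linearly interpolated between order statistics."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    ordered = sorted(token_logprobs)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


# Made-up values for illustration only, not measurements from any model.
# First list: most tokens are predicted almost perfectly after the first one.
memorized_like = [-2.1, -0.05, -0.02, -0.01, -0.01, -0.03, -0.02, -0.01]
# Second list: every token carries noticeable surprise.
novel_like = [-2.3, -1.7, -3.1, -0.9, -2.8, -1.2, -2.0, -1.5]

for name, seq in [("memorized_like", memorized_like), ("novel_like", novel_like)]:
    print(name, logprob_variance(seq), logprob_percentile(seq, 10.0))
```

Both statistics discard token order, whereas a score built on cumulative log-probabilities depends on where in the sequence the surprise occurs, so they are points of comparison rather than equivalents.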
1. Important problem, data contamination is still a difficult and important problem to be solved. 2. The paper explained their key ideas very clearly
1. Lack of innovation: there is already a wide range of confidence/logP-based scores [1], and people have already figured out that rephrasing escapes those detection methods [2, 3]. This method lacks merit in advancing the field. 2. The model and dataset used are too simple. Only one set of experiments is done (CRT / Llama-1) to show effectiveness. 3. Lack of analysis. Does question length have an effect here? What about CoT models? [1] Zhang, Huixuan, Yun Lin, and Xiaojun Wan. "Pacost: Paired con
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
