Detecting RLVR Training Data via Structural Convergence of Reasoning
Hongbo Zhang, Yue Yang, Jianhao Yan, Guangsheng Bao, Yue Zhang, Yue Zhang

TL;DR
This paper introduces Min-$k$NN Distance, a black-box method that identifies RLVR training data from the reduced diversity of a model's generations on prompts it was trained on, addressing benchmark contamination concerns.
Contribution
The paper proposes Min-$k$NN Distance, a simple and effective black-box detector that distinguishes RLVR training data from unseen data using only sampled completions, without requiring access to model internals or token probabilities.
Findings
Min-$k$NN Distance reliably detects RLVR training data.
The method outperforms existing contamination detection baselines.
RLVR training induces a collapse in output diversity for seen prompts.
Abstract
Reinforcement learning with verifiable rewards (RLVR) is central to training modern reasoning models, but the undisclosed training data raises concerns about benchmark contamination. Unlike pretraining methods, which optimize models using token-level probabilities, RLVR fine-tunes models based on reward feedback from self-generated reasoning trajectories, making conventional likelihood-based detection methods less effective. We show that RLVR induces a distinctive behavioral signature: prompts encountered during RLVR training result in more rigid and similar generations, while unseen prompts retain greater diversity. We introduce Min-$k$NN Distance, a simple black-box detector that quantifies this collapse by sampling multiple completions for a given prompt and computing the average of the $k$ smallest nearest-neighbor edit distances. Min-$k$NN Distance requires no access to the…
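The scoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `edit_distance` and `min_knn_distance` are hypothetical, and the sketch assumes character-level, unnormalized Levenshtein distance, since the excerpt does not say whether distances are computed over characters or tokens or whether they are length-normalized. Sampling the completions from the model is left out.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (an assumption; the paper's
    exact distance granularity is not given in this excerpt)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def min_knn_distance(completions: List[str], k: int) -> float:
    """Average of the k smallest nearest-neighbor edit distances among
    completions sampled for a single prompt (hypothetical helper)."""
    if len(completions) < 2:
        raise ValueError("need at least two completions")
    # For each completion, find the distance to its closest other completion.
    nn_distances = [
        min(edit_distance(ci, cj)
            for j, cj in enumerate(completions) if j != i)
        for i, ci in enumerate(completions)
    ]
    k = max(1, min(k, len(nn_distances)))  # clamp k to a valid range
    smallest = sorted(nn_distances)[:k]
    return sum(smallest) / len(smallest)
```

Under these assumptions, a lower score means the sampled completions are closer to one another, which the abstract associates with prompts seen during RLVR training. Turning the score into a seen/unseen decision would require a threshold or a ranking across prompts, which this sketch does not cover.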
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics
