# Do We Train on Test Data? Purging CIFAR of Near-Duplicates

**Authors:** Björn Barz, Joachim Denzler

arXiv: 1902.00423 · 2020-06-03

## TL;DR

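This paper finds that 3.3% of CIFAR-10 and 10% of CIFAR-100 test images have duplicates in the training set, builds a fair test set (ciFAIR) by replacing those duplicates with new images from the same domain, and reports that the classification accuracy of popular CNN architectures drops by 9% to 14%, relative to their original performance, when they are re-evaluated on it.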

## Contribution

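The authors introduce the ciFAIR dataset, in which every CIFAR-10 and CIFAR-100 test image that has a duplicate in the training set is replaced with a new image sampled from the same domain, so that test accuracy reflects generalization rather than memorization. They also publish pre-trained models and maintain a leaderboard.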

## Key findings

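- Classification accuracy of popular CNN architectures drops by 9% to 14%, relative to the original performance, on the duplicate-free test sets.
- 3.3% of CIFAR-10 and 10% of CIFAR-100 test images have duplicates in the training set; these can be recognized by memorization and may bias comparisons of generalization capability.
- The ciFAIR test sets, with all duplicates replaced by new images, allow fairer benchmarking of image recognition models.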
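The 9% to 14% figure is stated relative to the original performance, which is easy to confuse with a drop in percentage points. The sketch below shows the distinction, assuming "relative" means the change divided by the original value. The function names and the example numbers are hypothetical and are not results from the paper.

```python
def absolute_drop(original: float, new: float) -> float:
    """Difference between two scores, in the same units as the inputs."""
    return original - new


def relative_drop(original: float, new: float) -> float:
    """Drop expressed as a fraction of the original score."""
    if original <= 0:
        raise ValueError("original score must be positive")
    return (original - new) / original


# Illustrative numbers only (NOT taken from the paper): a score that
# falls from 0.50 to 0.45 loses 5 points in absolute terms but 10% in
# relative terms.
example_original, example_new = 0.50, 0.45
print(f"absolute: {absolute_drop(example_original, example_new):.3f}")
print(f"relative: {relative_drop(example_original, example_new):.1%}")
```

The same relative change can therefore correspond to very different absolute changes, depending on how high the original score was.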

## Abstract

The CIFAR-10 and CIFAR-100 datasets are two of the most heavily benchmarked datasets in computer vision and are often used to evaluate novel methods and model architectures in the field of deep learning. However, we find that 3.3% and 10% of the images from the test sets of these datasets have duplicates in the training set. These duplicates are easily recognizable by memorization and may, hence, bias the comparison of image recognition techniques regarding their generalization capability. To eliminate this bias, we provide the "fair CIFAR" (ciFAIR) dataset, where we replaced all duplicates in the test sets with new images sampled from the same domain. We then re-evaluate the classification performance of various popular state-of-the-art CNN architectures on these new test sets to investigate whether recent research has overfitted to memorizing data instead of learning abstract concepts. We find a significant drop in classification accuracy of between 9% and 14% relative to the original performance on the duplicate-free test set. The ciFAIR dataset and pre-trained models are available at https://cvjena.github.io/cifair/, where we also maintain a leaderboard.
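The abstract does not say how the duplicates were found. As a rough illustration of the general idea, the sketch below flags test items whose nearest training item is very similar under cosine similarity of feature vectors. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `find_near_duplicates`, the choice of cosine similarity, and the `threshold` value are all hypothetical, and candidates flagged this way would still need manual inspection.

```python
import numpy as np


def find_near_duplicates(train_feats, test_feats, threshold=0.95):
    """Return (test_index, train_index, similarity) for every test item
    whose most similar training item reaches `threshold`.

    Both inputs are 2-D arrays with one non-zero feature vector per row.
    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    train_feats = np.asarray(train_feats, dtype=float)
    test_feats = np.asarray(test_feats, dtype=float)

    # L2-normalize rows so that a dot product equals cosine similarity.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)

    # Pairwise similarities, shape (n_test, n_train).
    sims = test @ train.T

    # Nearest training item for each test item, and its similarity.
    nearest = sims.argmax(axis=1)
    best = sims[np.arange(len(test)), nearest]

    flagged = np.flatnonzero(best >= threshold)
    return [(int(i), int(nearest[i]), float(best[i])) for i in flagged]


# Toy usage with random vectors standing in for image features.
rng = np.random.default_rng(0)
toy_train = rng.normal(size=(100, 32))
toy_test = rng.normal(size=(10, 32))
toy_test[3] = toy_train[42] + 0.01 * rng.normal(size=32)  # planted near-duplicate
print(find_near_duplicates(toy_train, toy_test))
```

Computing the full similarity matrix is fine at toy scale; for 10,000 test images against 50,000 training images it would usually be processed in batches or with an approximate nearest-neighbour index.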

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.00423/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1902.00423/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1902.00423/full.md

---
Source: https://tomesphere.com/paper/1902.00423