The SVHN Dataset Is Deceptive for Probabilistic Generative Models Due to a Distribution Mismatch

Tim Z. Xiao; Johannes Zenn; Robert Bamler

arXiv:2312.02168 · cs.CV · December 7, 2023 · 1 citation

TL;DR

This paper shows that the official training and test sets of the SVHN dataset are not drawn from the same distribution, which severely affects the evaluation of probabilistic generative models, and it proposes a new split to address the issue.

Contribution

The authors identify a distribution mismatch in the SVHN dataset's official split and provide a new split to improve its reliability for generative modeling evaluation.

Findings

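01 The distribution mismatch between the official training and test sets severely affects the evaluation of probabilistic generative models, such as Variational Autoencoders and diffusion models.

02 The mismatch has little impact on the classification task, which may explain why it was not detected before.

03 The proposed new split, obtained by mixing and re-splitting the official training and test sets, improves evaluation consistency.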

Abstract

The Street View House Numbers (SVHN) dataset is a popular benchmark dataset in deep learning. Originally designed for digit classification tasks, the SVHN dataset has been widely used as a benchmark for various other tasks, including generative modeling. However, with this work, we aim to warn the community about an issue with the SVHN dataset as a benchmark for generative modeling tasks: we discover that the official training set and test set of the SVHN dataset are not drawn from the same distribution. We empirically show that this distribution mismatch has little impact on the classification task (which may explain why this issue has not been detected before), but it severely affects the evaluation of probabilistic generative models, such as Variational Autoencoders and diffusion models. As a workaround, we propose to mix and re-split the official training and test set when…
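
The "mix and re-split" workaround mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' released split: the function name `remix_and_resplit`, the fixed `seed`, and the choice to keep the original split sizes are all assumptions made for illustration, and the arrays below are small synthetic stand-ins rather than real SVHN images.

```python
import numpy as np


def remix_and_resplit(x_train, y_train, x_test, y_test, seed=0):
    """Pool two splits, shuffle them jointly, and cut back to the original sizes.

    Hypothetical helper: the seed and the size-preserving cut are assumptions,
    not details taken from the paper.
    """
    n_train = len(x_train)

    # Pool images and labels so both new splits are drawn from one mixture.
    x_all = np.concatenate([x_train, x_test], axis=0)
    y_all = np.concatenate([y_train, y_test], axis=0)

    # One shared permutation keeps each image aligned with its label.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x_all))
    train_idx, test_idx = perm[:n_train], perm[n_train:]

    return (x_all[train_idx], y_all[train_idx]), (x_all[test_idx], y_all[test_idx])


# Synthetic stand-ins: 6 "training" rows labelled 0 and 4 "test" rows labelled 1.
xa, ya = np.arange(12).reshape(6, 2), np.zeros(6, dtype=int)
xb, yb = np.arange(12, 20).reshape(4, 2), np.ones(4, dtype=int)
(new_train_x, new_train_y), (new_test_x, new_test_y) = remix_and_resplit(xa, ya, xb, yb)
```

Because both new splits are random subsets of the same pooled data, they follow the same distribution by construction. Fixing the seed makes the split reproducible, which matters if results are to be compared across papers.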

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Sparse Evolutionary Training · Diffusion