Revisiting Classifier Two-Sample Tests
David Lopez-Paz, Maxime Oquab

TL;DR
This paper explores classifier-based two-sample tests (C2ST), demonstrating their theoretical properties, competitive performance, and novel applications in evaluating generative models and causal discovery.
Contribution
It provides a comprehensive analysis of C2ST, compares it with existing methods, and introduces new uses in evaluating generative models and causal inference.
Findings
C2ST learns a suitable representation of the data on the fly.
C2ST has a simple null distribution and returns a test statistic in interpretable units (classification accuracy).
C2ST performs competitively against state-of-the-art two-sample tests.
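To make the "simple null distribution" concrete, here is a minimal sketch of a p-value computation. It assumes balanced classes and a classifier trained on data disjoint from the held-out subset, so that under the null hypothesis each held-out prediction is correct with probability 1/2 and the number of correct predictions is Binomial(n_test, 1/2). The function name `null_p_value` and its parameters are hypothetical and not taken from the paper.

```python
import math


def null_p_value(n_correct, n_test):
    """One-sided p-value P(X >= n_correct) for X ~ Binomial(n_test, 1/2).

    Assumption: under the null, each of the n_test held-out predictions
    is an independent fair coin flip (balanced classes).
    """
    # Count the outcomes with at least n_correct successes ...
    tail = sum(math.comb(n_test, i) for i in range(n_correct, n_test + 1))
    # ... and divide by the total number of equally likely outcomes.
    return tail / 2 ** n_test


if __name__ == "__main__":
    # 60 correct out of 100 held-out points: is this above chance level?
    print(null_p_value(60, 100))
```

A small p-value indicates that the held-out accuracy is higher than chance alone would plausibly produce, which is evidence against the null hypothesis.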
Abstract
The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis "$P = Q$" is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the…
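The procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a k-nearest-neighbour classifier written in plain Python, assumes the two samples have roughly equal size, and approximates the null distribution of the held-out accuracy by a normal distribution with mean 1/2 and variance 1/(4 * n_test). The function name `c2st_knn` and the parameters `k`, `test_fraction`, and `seed` are hypothetical.

```python
import math
import random


def c2st_knn(sample_p, sample_q, k=5, test_fraction=0.5, seed=0):
    """Classifier two-sample test with a k-nearest-neighbour classifier.

    sample_p, sample_q: lists of equal-length tuples of floats.
    Returns (held-out accuracy, approximate one-sided p-value).
    """
    rng = random.Random(seed)
    # Pair the first sample with label 1 and the second with label 0.
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)

    # Split into disjoint held-out test and training subsets.
    n_test = max(1, int(len(data) * test_fraction))
    test, train = data[:n_test], data[n_test:]

    def predict(x):
        # Majority vote among the k closest training points.
        nearest = sorted(train, key=lambda item: math.dist(x, item[0]))[:k]
        votes = sum(label for _, label in nearest)
        return 1 if 2 * votes > len(nearest) else 0

    # Test statistic: classification accuracy on the held-out subset.
    accuracy = sum(predict(x) == y for x, y in test) / n_test

    # Assumed null: accuracy is approximately N(1/2, 1/(4 * n_test)).
    z = (accuracy - 0.5) * 2.0 * math.sqrt(n_test)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return accuracy, p_value


if __name__ == "__main__":
    rng = random.Random(1)
    same_a = [(rng.gauss(0, 1),) for _ in range(200)]
    same_b = [(rng.gauss(0, 1),) for _ in range(200)]
    shifted = [(rng.gauss(3, 1),) for _ in range(200)]
    print(c2st_knn(same_a, same_b))   # accuracy should stay near 0.5
    print(c2st_knn(same_a, shifted))  # accuracy should be well above 0.5
```

The accuracy is directly interpretable: a value near 0.5 means the classifier cannot tell the samples apart, while a value well above 0.5 suggests the two distributions differ. Any binary classifier could replace the nearest-neighbour rule here; it was chosen only to keep the sketch dependency-free.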
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning in Healthcare · Anomaly Detection Techniques and Applications
