A Witness Two-Sample Test

Jonas M. Kübler; Wittawat Jitkrittum; Bernhard Schölkopf; Krikamol Muandet

arXiv:2102.05573·cs.LG·February 14, 2022

TL;DR

This paper introduces a two-sample test based on the Maximum Mean Discrepancy that uses the training data to define the weights and basis points of the witness function, improving data efficiency while keeping the type-I error under control.

Contribution

It proposes a test that uses the training data to select the basis points and weights of the witness function, rather than only the kernel, improving data efficiency and test power compared to existing MMD-based tests.

Findings

01 The new test is consistent and has a well-controlled type-I error.

02 It achieves comparable or better power than existing tests.

03 Empirical results on synthetic and real data validate its effectiveness.

Abstract

The Maximum Mean Discrepancy (MMD) has been the state-of-the-art nonparametric test for tackling the two-sample problem. Its statistic is given by the difference in expectations of the witness function, a real-valued function defined as a weighted sum of kernel evaluations on a set of basis points. Typically the kernel is optimized on a training set, and hypothesis testing is performed on a separate test set to avoid overfitting (i.e., control type-I error). That is, the test set is used to simultaneously estimate the expectations and define the basis points, while the training set only serves to select the kernel and is discarded. In this work, we propose to use the training data to also define the weights and the basis points for better data efficiency. We show that 1) the new test is consistent and has a well-controlled type-I error; 2) the optimal witness function is given by a…
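
The split described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation (see the linked repository for that). It rests on several assumptions that are not in the abstract: a Gaussian kernel with a fixed bandwidth (no kernel selection), the plain mean-embedding witness with weights +1/n and -1/m on the training points rather than the optimal witness the paper derives, and a one-sided normal approximation for the p-value. The function names `gaussian_kernel`, `fit_witness`, and `witness_test` are hypothetical.

```python
import math
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def fit_witness(x_train, y_train, bandwidth=1.0):
    """Build h(z) = mean_i k(z, x_i) - mean_j k(z, y_j) from training data only.

    The training points are the basis points; the weights are +1/n and -1/m.
    (Assumption: this is the simple MMD witness, not the paper's optimal one.)
    """
    def witness(z):
        kx = gaussian_kernel(z, x_train, bandwidth).mean(axis=1)
        ky = gaussian_kernel(z, y_train, bandwidth).mean(axis=1)
        return kx - ky

    return witness


def witness_test(x, y, train_fraction=0.5, bandwidth=1.0, seed=0):
    """Split the data, fit the witness on one part, test on the other.

    Returns the standardized statistic and a one-sided p-value based on a
    normal approximation (an assumption made for this sketch).
    """
    rng = np.random.default_rng(seed)
    x = x[rng.permutation(len(x))]
    y = y[rng.permutation(len(y))]
    nx = int(train_fraction * len(x))
    ny = int(train_fraction * len(y))

    witness = fit_witness(x[:nx], y[:ny], bandwidth)

    # Evaluate the fixed witness on held-out points only. Given the training
    # set, these values are i.i.d., which is what justifies the approximation.
    hx, hy = witness(x[nx:]), witness(y[ny:])
    diff = hx.mean() - hy.mean()
    std_err = math.sqrt(hx.var(ddof=1) / len(hx) + hy.var(ddof=1) / len(hy))
    stat = diff / std_err
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))  # 1 - Phi(stat)
    return stat, p_value
```

Calling `witness_test(x, y)` on two arrays of shape `(n, d)` and `(m, d)` returns the statistic and p-value; rejecting when the p-value falls below the chosen level gives the test. Because the witness is fixed before the held-out data are touched, the test statistic reduces to a difference of two sample means, which is why a simple standardization suffices here.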

Code & Models

Repositories

jmkuebler/wits-test
PyTorch · Official

Taxonomy

Topics: Statistical Methods and Inference · Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning