AutoML Two-Sample Test

Jonas M. Kübler, Vincent Stimper, Simon Buchholz, Krikamol Muandet, Bernhard Schölkopf

arXiv:2206.08843 · cs.LG · January 18, 2023

TL;DR

This paper introduces an AutoML two-sample test whose test statistic is the mean discrepancy of a learned witness function. With no user input and the same method in every experiment, it achieves competitive performance on a diverse distribution shift benchmark and on challenging two-sample testing problems.

Contribution

It proves that minimizing a squared loss yields a witness function with optimal testing power. This lets the test build on recent AutoML advances, so applying it requires no specialized knowledge of two-sample testing.

Findings

01 Achieves competitive performance on a diverse distribution shift benchmark

02 Uses the mean discrepancy of a learned witness function as the test statistic

03 Provides an open-source Python implementation

Abstract

Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery and as a means to detect distribution shifts. This has led to the development of many sophisticated test procedures that go beyond the standard supervised learning frameworks and whose usage can require specialized knowledge about two-sample testing. We use a simple test that takes the mean discrepancy of a witness function as the test statistic and prove that minimizing a squared loss leads to a witness with optimal testing power. This allows us to leverage recent advancements in AutoML. Without any user input about the problems at hand, and using the same method for all our experiments, our AutoML two-sample test achieves competitive performance on a diverse distribution shift benchmark as well as on challenging two-sample testing problems. We provide an implementation of the AutoML…
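The recipe in the abstract (learn a witness by minimizing a squared loss, then use its mean discrepancy on held-out data as the test statistic) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `fit_witness`, `mean_discrepancy`, and `witness_two_sample_test` are hypothetical helpers, a ridge regression on quadratic features stands in for the AutoML learner, and the permutation p-value is one common choice that the abstract itself does not specify. This is not the authors' implementation.

```python
import numpy as np


def fit_witness(x_train, y_train, ridge=1e-3):
    """Fit a witness by regularized least squares (a stand-in for AutoML).

    Samples from the first distribution get label 1, samples from the
    second get label 0, and the witness minimizes the squared loss.
    Inputs are assumed to be 2-D arrays of shape (n_samples, n_features).
    """
    data = np.vstack([x_train, y_train])
    labels = np.concatenate([np.ones(len(x_train)), np.zeros(len(y_train))])

    def features(z):
        # Bias, linear, and squared terms: a deliberately simple model class.
        return np.hstack([np.ones((len(z), 1)), z, z ** 2])

    phi = features(data)
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    weights = np.linalg.solve(gram, phi.T @ labels)
    return lambda z: features(z) @ weights


def mean_discrepancy(h_x, h_y):
    """Difference between the mean witness values on the two samples."""
    return h_x.mean() - h_y.mean()


def witness_two_sample_test(x, y, n_permutations=500, seed=0):
    """Split the data, learn a witness on one half, test on the other half.

    Returns the observed statistic and a one-sided permutation p-value.
    """
    rng = np.random.default_rng(seed)
    x = x[rng.permutation(len(x))]
    y = y[rng.permutation(len(y))]
    x_train, x_test = x[: len(x) // 2], x[len(x) // 2:]
    y_train, y_test = y[: len(y) // 2], y[len(y) // 2:]

    witness = fit_witness(x_train, y_train)
    h_x, h_y = witness(x_test), witness(y_test)
    observed = mean_discrepancy(h_x, h_y)

    # Under the null hypothesis the held-out witness values are exchangeable,
    # so shuffling them simulates the null distribution of the statistic.
    pooled = np.concatenate([h_x, h_y])
    n_x = len(h_x)
    exceed = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        if mean_discrepancy(shuffled[:n_x], shuffled[n_x:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value
```

Calling `witness_two_sample_test(x, y)` on two arrays of shape `(n_samples, n_features)` returns the statistic and its p-value; a small p-value is evidence that the two samples come from different distributions. Holding out half of the data for the test keeps the witness independent of the samples used to compute the statistic, which is what makes the permutation argument valid in this sketch.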

Code & Models

Videos

AutoML Two-Sample Test · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Data Stream Mining Techniques

Methods: Test